JP2001034709A

JP2001034709A - Fast recognition and retrieval system, method for accelerating recognition and retrieval used for the same and recording medium with its control program recorded thereon

Info

Publication number: JP2001034709A
Application number: JP11205503A
Authority: JP
Inventors: Noboru Nakajima; 昇中島
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1999-07-21
Filing date: 1999-07-21
Publication date: 2001-02-09
Anticipated expiration: 2019-07-21
Also published as: JP3374793B2

Abstract

PROBLEM TO BE SOLVED: To provide a high-speed recognition and retrieval system that can perform retrieval quickly and in a stable amount of time, without backtracking. SOLUTION: The subset generating part 13 of a learning processing means 1 duplicates any template that straddles the identification boundary used to separate lower-level subsets, and registers the template in both child nodes. A hierarchical structure storing part 14 stores the degenerated (reduced) features for each leaf node. An identification processing means 2 reads the contribution ratio of each feature quantity from the leaf node and performs classification using only the degenerated features. Retrieval is thus stable and fast, with no redundant backtracking of the decision tree. In addition, because the prior probability of each category is calculated and frequently occurring characters can be resolved at a shallow level of the decision tree, retrieval speed can be improved further.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a high-speed recognition and retrieval system, a method for accelerating recognition and retrieval used therein, and a recording medium on which a control program therefor is recorded, and more particularly to a method for speeding up recognition and retrieval processing that extracts a desired child node from a data set consisting of a large number of child nodes.

[0002]

[Prior Art] A conventional recognition system based on a decision tree comprises a feature extraction unit, a sample dictionary unit, a discrete decision tree generation unit, a decision tree storage unit, and a backtracking decision tree search unit.

[0003] In this decision-tree recognition system, when a character pattern is input, the feature extraction unit generates a feature vector. The decision tree, generated in advance by the discrete decision tree generation unit, is stored in the decision tree storage unit.

[0004] The feature vector is classified according to the condition stored in each node of the decision tree, and the child node indicated by the result is selected for the next round of classification. This is repeated until a terminal node is reached.

[0005] At the terminal node, the final matching against the sample dictionary is performed. If this matching does not succeed, the backtracking decision tree search unit goes back up to a higher level of the hierarchy and searches again. This decision-tree recognition system is disclosed in Japanese Patent Application Laid-Open No. 6-282687.

[0006]

[Problems to Be Solved by the Invention] In the conventional decision-tree recognition system described above, it is generally difficult to implement backtracking optimally, and in practice search efficiency often fails to improve. In the worst case, performance degrades to roughly that of an exhaustive search. Moreover, the search time is unstable because it depends on the pattern being searched, so search efficiency is not necessarily improved.

[0007] Furthermore, the hierarchical structure is built using a fixed, uniform criterion, for example evaluation by Euclidean distance from a specific position in the feature space, so the structure of the decision tree cannot be modified to suit the nature of the elements being searched. To improve the search efficiency perceived by the user, there should be a framework that outputs frequently searched elements quickly, but the above method cannot provide one.

[0008] Accordingly, an object of the present invention is to solve the above problems and to provide a high-speed recognition and retrieval system that can perform retrieval quickly and in a stable amount of time without backtracking, a method for accelerating recognition and retrieval used therein, and a recording medium on which a control program therefor is recorded.

[0009] Another object of the present invention is to provide a high-speed recognition and retrieval system that can optimize the structure of the decision tree according to the occurrence probability of the targets and thereby improve perceived search efficiency, a method for accelerating recognition and retrieval used therein, and a recording medium on which a control program therefor is recorded.

[0010]

[Means for Solving the Problems] A high-speed recognition and retrieval system according to the present invention generates a feature vector from an input character pattern, classifies the feature vector according to the condition stored in each node of a decision tree generated in advance, selects child nodes in sequence according to the classification results, and repeats this classification until a terminal node is reached. The system comprises: generating means for generating templates of multi-dimensional feature vectors, to be stored in a recognition dictionary, from a set of patterns to which correct categories have been assigned in advance; template dictionary storage means for storing each template created by the generating means in association with the patterns that contributed to its generation; subset generating means for classifying the templates currently under consideration, the sets of patterns corresponding to those templates, and the occurrence frequencies of the correct categories into subsets, and outputting the templates belonging to each subset together with a threshold for separating the subsets; hierarchical dictionary storage means for storing each subset of templates successively generated by the subset generating means in association with the subset of templates from which it was separated; decision tree classifying means for reading the hierarchical structure stored in the hierarchical dictionary storage means from the top level downward, classifying the input pattern, and outputting the resulting child node; and category determining means for reading, from a leaf node of the hierarchical structure, the feature quantities that are effective for determining a template and performing coarse classification using those feature quantities.

[0011] A method for accelerating recognition and retrieval according to the present invention is a method for a high-speed recognition and retrieval system that generates a feature vector from an input character pattern, classifies the feature vector according to the condition stored in each node of a decision tree generated in advance, selects child nodes in sequence according to the classification results, and repeats this classification until a terminal node is reached. The method comprises the steps of: generating templates of multi-dimensional feature vectors, to be stored in a recognition dictionary, from a set of patterns to which correct categories have been assigned in advance; classifying the templates currently under consideration, the sets of patterns corresponding to those templates, and the occurrence frequencies of the correct categories into subsets, and outputting the templates belonging to each subset together with a threshold for separating the subsets; reading, from the top level downward, the hierarchical structure stored in hierarchical dictionary means that stores each successively generated subset of templates in association with the subset of templates from which it was separated, classifying the input pattern, and outputting the resulting child node; and reading, from a leaf node of the hierarchical structure, the feature quantities that are effective for determining a template and performing coarse classification using those feature quantities.

[0012] A recording medium according to the present invention records a recognition and retrieval acceleration control program for speeding up recognition and retrieval in a recognition and retrieval device that generates a feature vector from an input character pattern, classifies the feature vector according to the condition stored in each node of a decision tree generated in advance, selects child nodes in sequence according to the classification results, and repeats this classification until a terminal node is reached. The program causes the recognition and retrieval device to: generate templates of multi-dimensional feature vectors, to be stored in a recognition dictionary, from a set of patterns to which correct categories have been assigned in advance; classify the templates currently under consideration, the sets of patterns corresponding to those templates, and the occurrence frequencies of the correct categories into subsets, and output the templates belonging to each subset together with a threshold for separating the subsets; read, from the top level downward, the hierarchical structure stored in hierarchical dictionary means that stores each successively generated subset of templates in association with the subset of templates from which it was separated, classify the input pattern, and output the resulting child node; and read, from a leaf node of the hierarchical structure, the feature quantities that are effective for determining a template and perform coarse classification using those feature quantities.

[0013] That is, the high-speed recognition and retrieval system of the present invention has a learning processing unit comprising: a dictionary creation unit that generates templates of multi-dimensional feature vectors, to be stored in the recognition dictionary, from a set of patterns to which correct categories have been assigned in advance; a template storage unit that stores each created template in association with the patterns that contributed to its generation; a subset generation unit that takes as input the templates currently under consideration and the sets of patterns corresponding to them, classifies them into subsets, and outputs the templates belonging to each subset together with a threshold or identification boundary for separating the subsets, and that, when the node to be output is a leaf node, also reduces the feature vectors of the templates in that subset and outputs only the dominant features that remain useful for subsequent classification; and a hierarchical structure storage unit that receives each subset of templates successively generated by the subset generation unit and stores it in association with the subset of templates from which it was separated.

[0014] The high-speed recognition and retrieval system of the present invention also has an identification processing unit comprising: a pattern classification unit that reads the hierarchical structure stored in the hierarchical structure storage unit of the learning processing unit from the top level downward, classifies the input pattern, outputs the resulting child node, and ends classification once the pattern has been classified down to the lowest level; and the hierarchical structure storage unit that stores the hierarchical structure.

[0015] Further, the high-speed recognition and retrieval system of the present invention has a coarse classification unit that reads the contribution ratio of each feature quantity from the leaf node of the hierarchical structure and performs coarse classification using only features with a low contribution ratio, and it operates so that, as classification proceeds, subsequent classification can be performed faster. In this case, the subset generation unit builds the decision tree by including any category that straddles the determined identification boundary in the subsets on both sides of the threshold, so patterns can be retrieved in a stable search time without redundant backtracking of the decision tree.

[0016] Furthermore, when determining the identification boundary between subsets, the high-speed recognition and retrieval system of the present invention calculates the prior probability of each correct category ω_j and forms the output subsets accordingly. Because it is designed so that frequently occurring characters can be resolved at an early level of the decision tree, perceived search speed is improved.

[0017]

[Embodiments of the Invention] Next, an embodiment of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing the configuration of a high-speed recognition and retrieval system according to one embodiment of the present invention. In FIG. 1, the system comprises learning processing means 1 and identification processing means 2.

[0018] The learning processing means 1 comprises a template dictionary creation unit 11, a template dictionary storage unit 12, a subset generation unit 13, and a hierarchical dictionary storage unit 14. The identification processing means 2 comprises a decision tree classification unit 21 and a category determination unit 22. The identification processing means 2 also includes the template dictionary storage unit 12 and the hierarchical dictionary storage unit 14 of the learning processing means 1.

[0019] The template dictionary creation unit 11 of the learning processing means 1 generates templates (reference patterns) of multi-dimensional feature vectors, to be stored in the recognition dictionary, from a set of patterns to which correct categories have been assigned in advance (for example, the category "あ" for a pattern of the character "あ"). The template dictionary storage unit 12 stores each created template in association with the patterns that contributed to its generation.

[0020] The subset generation unit 13 takes as input the templates under consideration, the sets of patterns corresponding to those templates, and the occurrence frequencies of the correct categories, classifies them into subsets, and outputs the templates belonging to each subset together with a threshold for separating the subsets. The hierarchical dictionary storage unit 14 receives each subset of templates successively generated by the subset generation unit 13 and stores it in association with the subset of templates from which it was separated.

[0021] The decision tree classification unit 21 of the identification processing means 2 reads, from the top level downward, the hierarchical structure stored in the template dictionary storage unit 12, the category determination unit 22, and the hierarchical structure storage unit 14 of the learning processing means 1, classifies the input pattern, and outputs the resulting child node. It ends classification once the pattern has been classified down to the lowest level.

[0022] The category determination unit 22 reads, from the hierarchical structure storage unit that stores the hierarchical structure, the feature quantities that are effective for determining a template at a leaf node (terminal node) of the hierarchy, and performs coarse classification using those feature quantities.

[0023] FIG. 2 is a flowchart showing the processing operation of the learning processing means 1 of FIG. 1, and FIG. 3 is a flowchart showing the processing operation of the identification processing means 2 of FIG. 1. FIGS. 4 to 7 are diagrams for explaining the processing operation of the high-speed recognition and retrieval system according to one embodiment of the present invention.

[0024] The overall operation of the high-speed recognition and retrieval system according to one embodiment of the present invention will be described with reference to FIGS. 1 to 7. The processing operations shown in FIGS. 2 and 3 are realized by the learning processing means 1 and the identification processing means 2 executing a program held in a control memory (not shown). A ROM (read-only memory), an IC (integrated circuit) memory, or the like can be used as the control memory.

[0025] The learning processing means 1 generates templates from the input learning patterns, each of which carries a correct category (step S1 in FIG. 2). A template is generated, for example, by averaging the patterns (feature vectors) that have the same correct category.
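Step S1 can be sketched as follows. This is a minimal illustration that assumes each learning pattern arrives as a (category, feature vector) pair; the function name `build_templates` and the sample data are hypothetical and do not come from the patent.

```python
from collections import defaultdict

def build_templates(patterns):
    """Average the feature vectors of each category into one template.

    patterns: iterable of (category, feature_vector) pairs (assumed format).
    Returns a dict mapping category -> mean feature vector.
    """
    sums = {}
    counts = defaultdict(int)
    for category, vec in patterns:
        if category not in sums:
            sums[category] = [0.0] * len(vec)
        for i, x in enumerate(vec):
            sums[category][i] += x
        counts[category] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

# Hypothetical two-dimensional learning patterns.
training = [("あ", [1.0, 2.0]), ("あ", [3.0, 4.0]), ("リ", [0.0, 6.0])]
templates = build_templates(training)
print(templates)  # {'あ': [2.0, 3.0], 'リ': [0.0, 6.0]}
```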

[0026] The learning processing means 1 registers the templates belonging to the node of interest (step S2 in FIG. 2). On the first pass through the loop, the node of interest is the root node and the templates belonging to it are all of the templates.

[0027] The learning processing means 1 classifies the templates belonging to the node of interest into a plurality of subsets by clustering (step S3 in FIG. 2). The clustering can be implemented, for example, with the existing k-means algorithm (Tou and Gonzalez, "Pattern Recognition Principles", Addison-Wesley Publishing Company, p. 90).
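As a rough illustration of step S3, the sketch below is a plain k-means loop. The initialization (taking the first k vectors as initial centers), the iteration cap, and the sample data are assumptions made for this example; the patent only names the algorithm.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, max_iterations=20):
    """Plain k-means. Returns (assignment, centers).

    Initial centers are the first k vectors (an assumption for determinism).
    """
    centers = [list(v) for v in vectors[:k]]
    assignment = None
    for _ in range(max_iterations):
        # Assign each vector to its nearest center.
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(v, centers[j]))
            for v in vectors
        ]
        # Move each center to the mean of its members.
        for j in range(k):
            members = [v for v, a in zip(vectors, new_assignment) if a == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_assignment == assignment:
            break  # assignments are stable, so the centers are too
        assignment = new_assignment
    return new_assignment, centers

# Hypothetical template vectors at the node of interest.
vectors = [[0, 0], [1, 0], [5, 0], [6, 0]]
assignment, centers = kmeans(vectors, k=2)
print(assignment)  # [0, 0, 1, 1]
print(centers)     # [[0.5, 0.0], [5.5, 0.0]]
```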

[0028] The learning processing means 1 finds the identification surface that forms the boundary between the generated subsets. The simpler the computation needed to express the identification surface, the more it helps speed up classification by the decision tree. However, if the identification surface is, for example, a linear hyperplane, it does not coincide with the identification boundary given by the templates themselves.

[0029] Consequently, near the identification surface between subsets, the result of classifying a pattern by the identification surface may conflict with the result of classifying it by the individual templates (the hatched region in FIG. 4). In this case, the templates that contribute to the region of the feature space where the conflict arises are included in both of the subsets concerned (the black-circle templates in FIG. 4).

[0030] The identification surface between subsets can be obtained, for example, by taking the mean of the feature vectors of the templates in each subset as that subset's center and using the perpendicular bisector hyperplane between the subset centers as the identification surface.
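The bisecting hyperplane and the duplication of templates near it can be sketched as below. The patent does not give a formula for deciding which templates "contribute to the conflicting region", so this example assumes a simple stand-in rule: any template whose distance to the hyperplane is less than a `margin` parameter is placed in both children. The `margin` parameter, the function names, and the sample templates are all hypothetical.

```python
import math

def mean_vector(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def bisecting_hyperplane(center_a, center_b):
    """Return (w, bias) such that w.x - bias < 0 on center_a's side."""
    w = [b - a for a, b in zip(center_a, center_b)]
    midpoint = [(a + b) / 2 for a, b in zip(center_a, center_b)]
    bias = sum(wi * mi for wi, mi in zip(w, midpoint))
    return w, bias

def signed_distance(x, w, bias):
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) - bias) / norm

def split_with_duplication(templates, subset_a, subset_b, margin):
    """Split templates by the bisecting hyperplane of the two subset centers.

    Templates closer than `margin` to the hyperplane go into both children
    (assumed rule standing in for the patent's conflict region).
    """
    center_a = mean_vector([templates[c] for c in subset_a])
    center_b = mean_vector([templates[c] for c in subset_b])
    w, bias = bisecting_hyperplane(center_a, center_b)
    child_a, child_b = [], []
    for category, vec in templates.items():
        d = signed_distance(vec, w, bias)
        if d < margin:
            child_a.append(category)
        if d > -margin:
            child_b.append(category)
    return (w, bias), child_a, child_b

# Hypothetical templates; the two subsets would come from clustering.
templates = {"あ": [0.0, 0.0], "い": [2.0, 0.0], "リ": [4.0, 0.0], "う": [6.0, 0.0]}
plane, child_a, child_b = split_with_duplication(
    templates, ["あ", "い"], ["リ", "う"], margin=1.5)
print(plane)    # ([4.0, 0.0], 12.0)
print(child_a)  # ['あ', 'い', 'リ']
print(child_b)  # ['い', 'リ', 'う']
```

Here "い" and "リ" lie within the margin of the hyperplane and so appear in both children, which is what lets a later search descend without ever needing to backtrack across this boundary.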

[0031] The learning processing means 1 registers each subset of templates produced by clustering as a child node of the node of interest (step S4 in FIG. 2). Along with this, it registers the identification surface generated in step S3 in association with the node of interest. The condition for treating a registered node as a leaf node can be set, for example, as "the number of templates belonging to the node has fallen below a specified value".

[0032] The learning processing means 1 determines whether the registered node satisfies the leaf node condition. If the node is determined not to be a leaf node, processing proceeds to step S6; if it is determined to be a leaf node, processing proceeds to step S7 (step S5 in FIG. 2).

[0033] The learning processing means 1 makes the registered child node the new node of interest, to be split further into child nodes, and recursively applies the processing from step S2 onward (step S6 in FIG. 2).

[0034] For the templates belonging to a node registered as a leaf node, the learning processing means 1 selects features that are effective for distinguishing those templates. For example, the first to n-th principal components obtained by principal component analysis of the feature vectors of those templates are used. The selected feature components are stored in association with the leaf node (step S7 in FIG. 2).
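The idea of keeping only the features that separate the templates of a leaf can be illustrated with a deliberately simpler stand-in for principal component analysis: rank the original feature dimensions by their variance across the leaf's templates and keep the top n. This is not PCA (PCA would yield linear combinations of dimensions rather than a subset of them); it is swapped in here only to keep the sketch short and dependency-free, and the function name and data are hypothetical.

```python
def select_features(template_vectors, n):
    """Return the indices of the n dimensions with the largest variance
    across the given templates (a simple stand-in for PCA)."""
    dims = len(template_vectors[0])
    count = len(template_vectors)
    variances = []
    for i in range(dims):
        column = [v[i] for v in template_vectors]
        mu = sum(column) / count
        variances.append(sum((x - mu) ** 2 for x in column) / count)
    ranked = sorted(range(dims), key=lambda i: variances[i], reverse=True)
    return sorted(ranked[:n])

# Two hypothetical leaf templates that differ mainly in dimension 1.
leaf_templates = [[1.0, 0.0, 5.0], [1.0, 4.0, 5.5]]
selected = select_features(leaf_templates, 1)
print(selected)  # [1]
```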

[0035] When all leaf nodes have been registered, the learning processing means 1 ends the above operation; otherwise it proceeds to step S9 (step S8 in FIG. 2). That is, the learning processing means 1 searches for a node whose leaf nodes have not yet been registered, makes that node the node of interest, and continues with the processing from step S2 onward (step S9 in FIG. 2).

[0036] Meanwhile, the identification processing means 2 compares the feature vector of the input pattern with the identification surface stored in the node of interest and determines which child node the pattern is classified into (step S11 in FIG. 3). When the identification surface is a linear hyperplane, as described above, classification depends on which side of the identification surface the input feature vector lies.

[0037] If the child node determined in step S11 is a leaf node, processing proceeds to step S13. Otherwise, the child node becomes the node of interest and the processing from step S11 onward is applied recursively (step S12 in FIG. 3).

[0038] The identification processing means 2 reads the selected feature elements stored in association with the leaf node (step S13 in FIG. 3). Using the selected feature elements, it matches the input pattern against the templates under the leaf node and outputs a distance value for each template (step S14 in FIG. 3). Finally, the identification processing means 2 outputs the category of the template with the smallest distance value as the recognition result (step S15 in FIG. 3).
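Steps S11 to S15 can be sketched as a single descent followed by matching at the leaf. The `Node` class, its field names, the use of squared Euclidean distance, and the representation of the selected features as a list of dimension indices are assumptions made for this example; the patent does not prescribe a data layout.

```python
class Node:
    """Hypothetical tree node. Internal nodes carry a hyperplane (w, bias)
    and two children; leaf nodes carry templates and selected feature indices."""
    def __init__(self, w=None, bias=0.0, low=None, high=None,
                 templates=None, features=None):
        self.w = w                        # None marks a leaf node
        self.bias = bias
        self.low = low                    # child for w.x - bias < 0
        self.high = high                  # child for w.x - bias >= 0
        self.templates = templates or {}  # category -> feature vector
        self.features = features or []    # indices used for matching

def classify(root, x):
    node = root
    # S11/S12: descend by hyperplane side until a leaf is reached.
    while node.w is not None:
        side = sum(wi * xi for wi, xi in zip(node.w, x)) - node.bias
        node = node.low if side < 0 else node.high
    # S13/S14: distance to each leaf template on the selected features only.
    best_category, best_distance = None, float("inf")
    for category, vec in node.templates.items():
        distance = sum((x[i] - vec[i]) ** 2 for i in node.features)
        if distance < best_distance:
            best_category, best_distance = category, distance
    # S15: the category of the nearest template is the result.
    return best_category

# A hypothetical two-level tree over a 2-D feature space.
low_leaf = Node(templates={"あ": [1.0, 0.0], "い": [1.0, 4.0]}, features=[1])
high_leaf = Node(templates={"リ": [4.0, 1.0], "う": [6.0, 1.0]}, features=[0])
root = Node(w=[1.0, 0.0], bias=3.0, low=low_leaf, high=high_leaf)

print(classify(root, [0.5, 3.0]))  # い
print(classify(root, [5.8, 0.0]))  # う
```

The descent visits exactly one node per level and never revisits a parent, so the work per query is bounded by the tree depth plus the number of templates in one leaf.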

[0039] In this embodiment, the subsequent classification method is optimized according to the subset at the leaf node determined by the decision tree, so recognition and retrieval are stable and fast. In addition, when a conflict arises at a boundary surface during clustering, the template is registered in both nodes. This avoids cumbersome backtracking at recognition time and allows results to be presented in a consistent search time for any template.

[0040] Next, an example is described in which the correct category is retrieved from a character's feature pattern using a binary tree. Assume that the templates are distributed in a two-dimensional feature space as shown in FIG. 5. For simplicity, the number of templates belonging to a leaf node is taken to be fewer than two. In two dimensions, the identification surface between subsets is a straight line.

[0041] Generating identification surfaces according to the above algorithm gives the result shown in FIG. 6. In FIG. 6 the identification surfaces are drawn with different line thicknesses, indicating that they were generated in order from thick to thin.

[0042] The corresponding binary tree is shown in FIG. 7. Templates that straddle an identification surface between subsets, such as "あ" and "リ" in FIG. 7, are registered more than once in the binary tree.

[0043] As a result, at recognition time the category of the template can be identified in a single depth-first descent, without backtracking. In addition, because features that best distinguish the two templates have been selected at the leaf node, the computational cost can be reduced.

[0044] Unlike the cases shown in FIGS. 6 and 7, real feature vectors are multi-dimensional. It is therefore effective, for making matching more efficient, to exclude those elements of the feature vector that have already been used during classification by the binary tree and are no longer needed for matching the templates belonging to the leaf node.

[0045] FIG. 8 is a block diagram showing the configuration of a high-speed recognition and retrieval system according to another embodiment of the present invention. In FIG. 8, the system according to this other embodiment has the same configuration as the system according to the one embodiment shown in FIG. 1, except that a category appearance frequency measurement unit 15 is added to the learning processing means 3; identical components are denoted by the same reference numerals. The operation of the identical components is the same as in the system according to the one embodiment.

[0046] The category appearance frequency measurement unit 15 measures the prior probability of occurrence of each category over the set of data to be recognized and searched, and outputs it to the subset generation unit 13. The subset generation unit 13 generates subsets of the templates taking into account the category appearance frequencies output by the category appearance frequency measurement unit 15.
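Measuring the prior probability of each category amounts to counting relative frequencies over the labelled data. A minimal sketch follows, assuming the data is available as a flat list of category labels; the function name `category_priors` is hypothetical.

```python
from collections import Counter


def category_priors(labels):
    """Relative frequency of each category label in the data set."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()}


# Hypothetical sample of correct-category labels.
priors = category_priors(["あ", "あ", "い", "う"])
```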

[0047] For example, when the node of interest contains templates of a frequently occurring category, the size of the subset containing those templates is adjusted to be smaller according to the appearance frequency. This allows frequently occurring templates to become leaf nodes at an early stage.

[0048] When a frequently occurring template becomes a leaf node at a higher level of the hierarchy, the amount of computation needed to reach that leaf node is reduced, so frequently occurring categories can be retrieved quickly. One way to control the subset size is, for example, when generating the lower subsets, not to use the bisecting hyperplane as the identification plane but to translate the identification plane toward the subset with the higher appearance frequency.

[0049] Next, the case of generating subsets $G_1$ and $G_2$ from the node of interest $G_0$ is described. Conventionally, when appearance frequency is not considered (that is, when the appearance frequencies of the categories $\omega_j$ belonging to $G_1$ and $G_2$ are equal), the identification plane is

$$(x_1 - x_2)\cdot x - \frac{\|x_1\|^2 - \|x_2\|^2}{2} = 0,$$

where $x_1$, $x_2$, and $x$ are vectors.

[0050] If the identification plane is translated toward the subset whose categories $\omega_j$ have the higher appearance frequency, it becomes

$$(x_1 - x_2)\cdot x - \left\{(A+1)\|x_1\|^2 - (2A+1)\,x_1\cdot x_2 + A\|x_2\|^2\right\} = 0.$$

Here [Equation 1] holds, where the constant $k$ takes a value in $0 < k \le 1$ and is a parameter controlling how far the identification plane is moved in response to the deviation of the appearance frequency ratio from 1/2.
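The relationship between the two plane equations in [0049] and [0050] can be checked numerically. In the sketch below, `A` is passed in directly as a number, because [Equation 1] is not available in this text; the function names are hypothetical. Expanding the braces shows that the shifted plane is the plane with normal $x_1 - x_2$ passing through the point $(A+1)x_1 - A x_2$, so $A = -1/2$ recovers the bisecting plane of [0049], and values of $A$ closer to 0 move the plane toward $x_1$.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def plane_value(x, x1, x2, A=-0.5):
    """Left-hand side of the identification plane in [0050]; A = -0.5 gives [0049]."""
    w = [a - b for a, b in zip(x1, x2)]
    offset = (A + 1) * dot(x1, x1) - (2 * A + 1) * dot(x1, x2) + A * dot(x2, x2)
    return dot(w, x) - offset


def point_on_plane(x1, x2, A=-0.5):
    """The point (A+1)*x1 - A*x2, which lies on the plane for any A."""
    return [(A + 1) * a - A * b for a, b in zip(x1, x2)]


# Hypothetical 2-D vectors.
x1, x2 = [2.0, 0.0], [0.0, 0.0]
midpoint = point_on_plane(x1, x2)           # A = -0.5: the midpoint of x1 and x2
shifted = point_on_plane(x1, x2, A=-0.25)   # plane moved toward x1
```

With these vectors the plane for `A = -0.25` crosses the segment between `x2` and `x1` three quarters of the way along, so the region on the `x1` side is smaller than with the bisecting plane.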

[0051] The other processing units of the learning processing means 1 and the identification processing means 2 operate in the same way as in the high-speed recognition and retrieval system according to the one embodiment of the present invention, so a description of their operation is omitted.

[0052] In this other embodiment of the present invention, when the categories belonging to a subset have a high appearance frequency, the extent of the subset in the feature space is controlled according to that frequency, so that categories with a high appearance frequency can become leaf nodes at shallow levels of the decision tree. The category of a template belonging to a frequently occurring category can therefore be identified with little computation, which makes retrieval perceptibly faster.

[0053] For example, in character recognition, hiragana make up most of a document; if the frequency distribution of character types in the target documents is known, a decision tree can be built to match it. Because the tree is recorded so that classification of hiragana finishes at shallow levels of the decision tree, characters across the whole document can be recognized at high speed.
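The benefit described in [0053] can be illustrated with a small expected-cost calculation: the average number of node tests per character is the sum over categories of prior probability times leaf depth. The frequencies and depths below are invented for illustration only and do not come from the patent.

```python
def expected_depth(priors, depths):
    """Average leaf depth weighted by category prior probability."""
    return sum(p * depths[cat] for cat, p in priors.items())


# Hypothetical figures: hiragana are frequent, kanji less so.
priors = {"hiragana": 0.6, "kanji": 0.4}
balanced = {"hiragana": 4, "kanji": 4}   # frequency ignored
skewed = {"hiragana": 2, "kanji": 6}     # frequent class placed shallow

cost_balanced = expected_depth(priors, balanced)
cost_skewed = expected_depth(priors, skewed)
```

Under these assumed numbers the frequency-aware tree needs about 3.6 node tests per character on average against 4.0 for the balanced tree, even though rare categories sit deeper.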

[0054] In this way, high-speed recognition and retrieval can be performed by optimizing, according to the distribution of the templates belonging to each leaf node of the decision tree, the classification method used from that point on to determine the category.

[0055] In addition, when the decision tree is generated, templates that straddle a boundary between subsets are registered in both nodes. This avoids cumbersome backtracking at recognition time, so search results can be presented in a stable search time for any template.

[0056] Furthermore, when the categories belonging to a subset have a high appearance frequency, the extent of the subset in the feature space is controlled according to that frequency. Categories with a high appearance frequency thereby become leaf nodes at shallow levels of the decision tree, and the category of a template belonging to such a category can be identified with little computation, so retrieval becomes perceptibly faster in proportion to the appearance frequency of the target categories.

[0057]

[Effects of the Invention] As described above, according to the high-speed recognition and retrieval system of the present invention, the classification method used to determine the category is optimized according to the distribution of the templates belonging to the leaf nodes of the decision tree, and, when the decision tree is generated, templates that straddle a boundary between subsets are registered in both nodes. This has the effect that searches can be executed at high speed in a stable required time without backtracking.

[0058] According to another high-speed recognition and retrieval system of the present invention, when the categories belonging to a subset have a high appearance frequency, the extent of the subset in the feature space is controlled according to that frequency. This has the effect that the structure of the decision tree can be optimized according to the occurrence probabilities of the targets, improving perceived search efficiency.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of a high-speed recognition and retrieval system according to one embodiment of the present invention.

FIG. 2 is a flowchart showing the processing operation of the learning processing means in FIG. 1.

FIG. 3 is a flowchart showing the processing operation of the identification processing means in FIG. 1.

FIG. 4 is a diagram for explaining the processing operation of the high-speed recognition and retrieval system according to one embodiment of the present invention.

FIG. 5 is a diagram for explaining the processing operation of the high-speed recognition and retrieval system according to one embodiment of the present invention.

FIG. 6 is a diagram for explaining the processing operation of the high-speed recognition and retrieval system according to one embodiment of the present invention.

FIG. 7 is a diagram for explaining the processing operation of the high-speed recognition and retrieval system according to one embodiment of the present invention.

FIG. 8 is a block diagram showing the configuration of a high-speed recognition and retrieval system according to another embodiment of the present invention.

[Explanation of symbols]

1 learning processing means
2 identification processing means
11 template dictionary creation unit
12 template dictionary storage unit
13 subset generation unit
14 hierarchical dictionary storage unit
15 category appearance frequency measurement unit
21 decision tree classification unit
22 category determination unit

Claims

[Claims]

1. A high-speed recognition and retrieval system that generates a feature vector from an input character pattern, identifies the feature vector according to a condition stored in each node of a previously generated decision tree, sequentially selects child nodes according to the identification result, and repeats this classification until a terminal node is reached, the system comprising: generating means for generating, from a set of patterns to which predetermined correct categories have been assigned, templates of multidimensional feature vectors to be stored in a recognition dictionary; template dictionary storage means for storing the templates created by the generating means in association with the patterns that contributed to the generation of those templates; subset generation means for classifying the set of templates currently of interest, together with the patterns corresponding to each of the templates and the appearance frequencies of the correct categories, into subsets, and for outputting the templates belonging to each subset and a threshold for performing the separation into the subsets; hierarchical dictionary storage means for storing each subset of templates sequentially generated by the subset generation means in association with the corresponding subset of templates before separation; decision tree classification means for receiving the hierarchical structure stored in the hierarchical dictionary storage means in order from the upper levels, classifying an input pattern, and outputting the child node resulting from the classification; and category determination means for reading, from a leaf node of the hierarchical structure, the feature quantities effective for determining a template and performing a large classification using those feature quantities.

2. The high-speed recognition and retrieval system according to claim 1, wherein, when a subset is a leaf node, the subset generation means performs feature reduction on the templates included in the subset and selects and outputs only the effective feature components.

3. The high-speed recognition and retrieval system according to claim 1 or claim 2, wherein the decision tree classification means ends the classification when classification of the pattern into child nodes has been completed down to the lowest level.

4. The high-speed recognition and retrieval system according to any one of claims 1 to 3, wherein the subset generation means generates the decision tree by including categories that straddle the determined threshold in the subsets on both sides of the threshold.

5. The high-speed recognition and retrieval system according to any one of claims 1 to 4, further comprising category appearance frequency measurement means for measuring the appearance frequencies of the correct categories and outputting them to the subset generation means, wherein the subset generation means, in classifying the set of templates into subsets, outputs a threshold that controls the classification boundary according to the prior probabilities of the correct categories.

6. A method for accelerating recognition and retrieval in a high-speed recognition and retrieval system that generates a feature vector from an input character pattern, identifies the feature vector according to a condition stored in each node of a previously generated decision tree, sequentially selects child nodes according to the identification result, and repeats this classification until a terminal node is reached, the method comprising the steps of: generating, from a set of patterns to which predetermined correct categories have been assigned, templates of multidimensional feature vectors to be stored in a recognition dictionary; classifying the set of templates currently of interest, together with the patterns corresponding to each of the templates and the appearance frequencies of the correct categories, into subsets, and outputting the templates belonging to each subset and a threshold for performing the separation into the subsets; receiving, in order from the upper levels, the hierarchical structure stored in hierarchical dictionary means that stores each sequentially generated subset of templates in association with the corresponding subset of templates before separation, classifying an input pattern, and outputting the child node resulting from the classification; and reading, from a leaf node of the hierarchical structure, the feature quantities effective for determining a template and performing a large classification using those feature quantities.

7. The method for accelerating recognition and retrieval according to claim 6, wherein, when a subset is a leaf node, the step of outputting the templates belonging to each subset and the threshold for performing the separation into the subsets performs feature reduction on the templates included in the subset and selects and outputs only the effective feature components.

8. The method for accelerating recognition and retrieval according to claim 6 or claim 7, wherein the step of outputting the child node resulting from the classification ends the classification when classification of the pattern into child nodes has been completed down to the lowest level.

9. The method for accelerating recognition and retrieval according to claim 6, wherein the step of outputting the templates belonging to each subset and the threshold for performing the separation into the subsets generates the decision tree by including categories that straddle the determined threshold in the subsets on both sides of the threshold.

10. The method for accelerating recognition and retrieval according to any one of claims 6 to 9, further comprising a step of measuring the appearance frequencies of the correct categories and outputting them to the subset generation means, wherein the step of outputting the templates belonging to each subset and the threshold for performing the separation into the subsets outputs, in classifying the set of templates into subsets, a threshold that controls the classification boundary according to the prior probabilities of the correct categories.

11. A recording medium on which is recorded a recognition and retrieval acceleration control program for accelerating recognition and retrieval in a recognition and retrieval device that generates a feature vector from an input character pattern, identifies the feature vector according to a condition stored in each node of a previously generated decision tree, sequentially selects child nodes according to the identification result, and repeats this classification until a terminal node is reached, the program causing the recognition and retrieval device to: generate, from a set of patterns to which predetermined correct categories have been assigned, templates of multidimensional feature vectors to be stored in a recognition dictionary; classify the set of templates currently of interest, together with the patterns corresponding to each of the templates and the appearance frequencies of the correct categories, into subsets, and output the templates belonging to each subset and a threshold for performing the separation into the subsets; receive, in order from the upper levels, the hierarchical structure stored in hierarchical dictionary means that stores each sequentially generated subset of templates in association with the corresponding subset of templates before separation, classify an input pattern, and output the child node resulting from the classification; and read, from a leaf node of the hierarchical structure, the feature quantities effective for determining a template and perform a large classification using those feature quantities.