JP2000293635A

JP2000293635A - Character recognizing device and its method and character recognition program recording medium

Info

Publication number: JP2000293635A
Application number: JP11102584A
Authority: JP
Inventors: Chiaki Sakaki; 千秋榊
Original assignee: NEC Engineering Ltd
Current assignee: NEC Engineering Ltd
Priority date: 1999-04-09
Filing date: 1999-04-09
Publication date: 2000-10-20

Abstract

PROBLEM TO BE SOLVED: To shorten the time required to calculate cumulative similarity, and to obtain a recognition result quickly without degrading recognition accuracy, in a character recognizing device that recognizes, as one word, character pattern data in which each character of one-word data has been subdivided. SOLUTION: In this character recognizing method, the information in a word dictionary storing part 13 and a character pattern storing part 12 is read sequentially. When a comparison of the number of valid word characters stored in the storing part 12 with the number of word characters successively read from the word dictionary shows that the word is not within the character-count range, the next word is read. Then, a cumulative similarity indicating how similar the pattern data is to each word of the word dictionary is calculated from the connection count of each character pattern and its similarity to each character, both stored in the storing part 12, and from the comparison result of the comparing part 15. This cumulative similarity is compared with the previously calculated ones to judge whether it should be stored as a word candidate, and it is stored if so. The stored cumulative similarities are then sorted in order.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a character recognition device, a character recognition method, and a character recognition program recording medium, and more particularly to a device, method, and recording medium that recognize, as one word, character pattern data in which each character of one-word data has been subdivided.

[0002]

2. Description of the Related Art

Conventionally developed character recognition devices use various recognition means to recognize individual characters from character pattern data in which each character of one-word data has been subdivided. They then recognize the word, determine which word in the word dictionary it most resembles, and output the result. In this case, the complex recognition processing performed to improve recognition accuracy increases the processing time, so the recognition speed drops.

[0003] One conventional character recognition device that addresses this problem is described in, for example, Japanese Patent Application Laid-Open No. 7-239917. The character recognition device described in that publication is explained with reference to FIG. 3.

[0004] In FIG. 3, the conventional character recognition device comprises a character extraction unit 1 that cuts out a character image for each character from a character image, a character image comparison unit 2 that compares character images with each other, a character recognition unit 3 that performs character recognition, a character image storage unit 4 that stores character images, a control unit 6 that controls each unit, and a character attribute comparison unit 7 that compares character attributes and calculates their similarity. A storage record 5 in the character image storage unit 4 consists of a character attribute, information on the character image, a character code, and frequency information.

[0005] In this configuration, the character extraction unit 1 cuts out a character image for each character from the character image and passes it to the character image comparison unit 2, the character recognition unit 3, and the character image storage unit 4. It also passes the character attribute of the extracted character image to the character attribute comparison unit 7 and the character image storage unit 4.

[0006] The character attribute comparison unit 7 compares the character attribute passed from the character extraction unit 1 with the character attribute stored in the character image storage unit 4 and read out by the control unit 6, and calculates their similarity. If the similarity is greater than a predetermined value, it notifies the control unit 6 to continue the calculations of the character image comparison unit 2 and the character recognition unit 3. If the similarity is smaller than the predetermined value, it notifies the control unit 6 to stop the similarity calculation against the character image information that is being performed in the character image comparison unit 2, and operation stops until a new character image is output from the character extraction unit 1.

[0007]

PROBLEMS TO BE SOLVED BY THE INVENTION

In the configuration described above, the similarity is calculated first and only afterwards is its magnitude evaluated to decide whether to continue or stop the calculation. The drawback is that the processing time before moving on to the next word increases.

[0008] The configuration described above also assumes that each individual character has already been recognized, which cannot be determined when the character pattern data consists of characters that have been subdivided. In such cases, the conventional approach was to derive the number of patterns that combine into one character by calculating the cumulative similarity of adjacent character data for a predetermined set of connection combinations. This conventional configuration sometimes forces cumulative similarity calculations for superfluous connection combinations, which increases the processing time.

[0009] Furthermore, the configuration described above outputs a recognition result for one character each time, so further processing is needed to determine which result is the most similar.

[0010] The present invention has been made to solve the above drawbacks of the prior art. Its object is to provide a character recognition device that shortens the time needed for cumulative similarity calculation and obtains word recognition results at high speed without degrading recognition accuracy.

[0011]

MEANS FOR SOLVING THE PROBLEMS

A character recognition device according to the present invention recognizes, as one word, extracted character pattern data in which each character of one-word data has been subdivided, and comprises: a word dictionary in which words are stored; character pattern storage means for storing the number of extracted characters, each character similarity, each character connection count, and the number of valid word characters of the extracted character pattern data; address control means for sequentially reading the information in the word dictionary and the character pattern storage means; character count comparison means for controlling the address control means so that, when a comparison of the number of valid word characters stored in the character pattern storage means with the number of word characters sequentially read from the word dictionary shows that the word is not within the character-count range, the next word stored in the word dictionary is read; cumulative similarity calculation means for calculating a cumulative similarity, which indicates how similar the extracted character pattern data is to each word in the word dictionary, based on each character connection count of the character pattern data stored in the character pattern storage means, each character similarity (the similarity of a character pattern to each character), and the comparison result of the character count comparison means; word candidate storage means for storing cumulative similarities; calculation result comparison means for comparing the cumulative similarity calculated by the cumulative similarity calculation means with the cumulative similarities stored in the word candidate storage means and, when the word is to be a word candidate, storing it in the word candidate storage means; and ranking means for sorting the cumulative similarities stored in the word candidate storage means into order.

The cumulative similarity calculation means connects adjacent character data into one character in accordance with the character connection count of each piece of character pattern data, stored in the character pattern storage means, in which one character has been subdivided, and calculates from each character similarity stored in the character pattern storage means how similar the data is to each word in the word dictionary.

The character count comparison means compares the number of valid word characters stored in the character pattern storage means with the number of word characters sequentially read from the word dictionary and, when the result is not within the character-count range, controls the address control means so that the next word stored in the word dictionary is read without the calculation processing of the cumulative similarity calculation means being performed.

Further, the ranking means sorts only a predetermined number of the leading candidates among the cumulative similarities held in the word candidate storage means.

[0012] A character recognition method according to the present invention is a method for a character recognition device that includes a word dictionary in which words are stored and a character pattern storage unit that stores the number of extracted characters, each character similarity, each character connection count, and the number of valid word characters of extracted character pattern data, and that recognizes, as one word, extracted character pattern data in which each character of one-word data has been subdivided. The method comprises: a character count comparison step of reading the next word stored in the word dictionary when a comparison of the number of valid word characters stored in the character pattern storage unit with the number of word characters sequentially read from the word dictionary shows that the word is not within the character-count range; a cumulative similarity calculation step of calculating a cumulative similarity, which indicates how similar the extracted character pattern data is to each word in the word dictionary, based on each character connection count of the character pattern data stored in the character pattern storage unit, each character similarity (the similarity of a character pattern to each character), and the comparison result of the character count comparison step; a calculation result comparison step of comparing the calculated cumulative similarity with the cumulative similarities stored in a word candidate storage unit and, when the word is to be a word candidate, storing it in the word candidate storage unit; and a ranking step of sorting the cumulative similarities stored in the word candidate storage unit into order.

[0013] A recording medium according to the present invention records a character recognition program for controlling a character recognition device that includes a word dictionary in which words are stored and a character pattern storage unit that stores the number of extracted characters, each character similarity, each character connection count, and the number of valid word characters of extracted character pattern data, and that recognizes, as one word, extracted character pattern data in which each character of one-word data has been subdivided. The character recognition program comprises: a character count comparison step of reading the next word stored in the word dictionary when a comparison of the number of valid word characters stored in the character pattern storage unit with the number of word characters sequentially read from the word dictionary shows that the word is not within the character-count range; a cumulative similarity calculation step of calculating a cumulative similarity, which indicates how similar the extracted character pattern data is to each word in the word dictionary, based on each character connection count of the character pattern data stored in the character pattern storage unit, each character similarity (the similarity of a character pattern to each character), and the comparison result of the character count comparison step; a calculation result comparison step of comparing the calculated cumulative similarity with the cumulative similarities stored in a word candidate storage unit and, when the word is to be a word candidate, storing it in the word candidate storage unit; and a ranking step of sorting the cumulative similarities stored in the word candidate storage unit into order.

[0014] In short, this character recognition device shortens the processing time by comparing character counts against the character pattern information of the word to be recognized before the cumulative similarity calculation, which is the time-consuming part of character recognition processing. In addition, for character pattern data in which one character has been subdivided, it shortens the time needed to derive the number of patterns that combine into one character by referring to each character connection count in the character pattern information of the word to be recognized. It can also take a word dictionary containing a large number of words, sort a predetermined number of top candidates, and output the result.
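The order of operations summarized in this paragraph can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `recognize` and `toy_score` are hypothetical, the scoring function is a stand-in that merely counts matching letters, and the valid range of 3 to 4 characters is chosen only for the demonstration.

```python
import heapq

def recognize(dictionary, valid_min, valid_max, score, top_n=10):
    """Filter by character count first, score only the survivors, keep the top N."""
    scored = []
    skipped = 0
    for number, word in enumerate(dictionary, start=1):
        # Cheap check first: words outside the valid range are never scored.
        if not (valid_min <= len(word) <= valid_max):
            skipped += 1
            continue
        scored.append((score(word), number, word))
    # Keep only a predetermined number of leading candidates, best first.
    ranked = heapq.nlargest(top_n, scored, key=lambda item: item[0])
    return ranked, skipped

def toy_score(word, observed="burn"):
    # Hypothetical stand-in for the cumulative similarity calculation.
    return sum(a == b for a, b in zip(word, observed))

ranked, skipped = recognize(["bum", "burn", "elephant", "barn", "to"], 3, 4, toy_score, top_n=2)
print(ranked, skipped)
```

The point of the sketch is only the ordering: the expensive `score` call is reached for three of the five dictionary words, because two are rejected by the character-count check.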

[0015]

EMBODIMENTS OF THE INVENTION

Next, an embodiment of the present invention is described with reference to the drawings. FIG. 1 is a block diagram showing an embodiment of a character recognition device according to the present invention.

[0016] In FIG. 1, the device comprises: an address control unit 10 that supplies a control signal for reading data to a character pattern storage unit 11 and a word dictionary storage unit 13; the character pattern storage unit 11, which stores the number of extracted characters, each character similarity, each character connection count, and the number of valid word characters as a character pattern storage record 12; the word dictionary storage unit 13, which stores a dictionary word number, word character information, and a word character count for each word; a character count comparison unit 15 that judges, from the number of valid word characters of the character pattern storage record 12 and the word character count of a word storage record 14, whether the word is a calculation target; a cumulative similarity calculation unit 16 that receives the calculation instruction and the word character count from the character count comparison unit 15 and calculates the cumulative similarity from the number of extracted characters, each character similarity, and each character connection count of the character pattern storage record 12; a calculation result comparison unit 17 that compares the result calculated by the cumulative similarity calculation unit 16 with the results stored in a word candidate storage unit 18; the word candidate storage unit 18, which stores word candidates; and a result ranking unit 19 that ranks the word candidates stored in the word candidate storage unit 18 in descending order of cumulative similarity.

[0017] The address control unit 10 uses control signal 100 to supply the character pattern storage unit 11 and the word dictionary storage unit 13 with a control signal for reading data. As a result, the number of valid word characters of the character pattern storage record 12 is sent from the character pattern storage unit 11 to the character count comparison unit 15 over signal line 101, and at the same time the number of extracted characters, each character similarity, and each character connection count of the character pattern storage record 12 are sent from the character pattern storage unit 11 to the cumulative similarity calculation unit 16 over signal line 105.

[0018] In addition, the word character count of the word storage record 14 is sent from the word dictionary storage unit 13 to the character count comparison unit 15 over signal line 102, and at the same time the dictionary word number and word character information of the word storage record 14 are sent from the word dictionary storage unit 13 to the cumulative similarity calculation unit 16 over signal line 106.

[0019] The character count comparison unit 15 judges whether the word is a calculation target by comparing the number of valid word characters of the character pattern storage record 12, sent over signal line 101, with the word character count of the word storage record 14, sent over signal line 102. The address control unit 10 also operates in the same way in response to control signal 103, and controls control signal 100 so that the information of the next word is read from the word dictionary storage unit 13.

[0020] The character pattern storage unit 11 stores, as the character pattern information of the word to be recognized, the number of extracted characters, each character similarity, each character connection count, and the number of valid word characters as a character pattern storage record 12.
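As a rough illustration of the four fields listed above, the record could be modeled like this. The class name, field names, and the choice of a per-pattern dictionary for the similarities are assumptions made for this sketch, and the numeric values are invented; the source does not specify a storage layout.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class CharacterPatternRecord:
    """Hypothetical model of one character pattern storage record."""
    extracted_count: int                   # number of extracted character patterns
    similarities: List[Dict[str, float]]   # per pattern: candidate character -> similarity
    connections: List[int]                 # per pattern: character connection count
    valid_min: int                         # valid word character range, minimum
    valid_max: int                         # valid word character range, maximum

    def __post_init__(self):
        # The per-pattern lists must describe exactly the extracted patterns.
        if not (len(self.similarities) == len(self.connections) == self.extracted_count):
            raise ValueError("per-pattern lists must match the extracted count")
        if self.valid_min > self.valid_max:
            raise ValueError("valid range minimum exceeds maximum")

# Invented values for a word cut into four patterns, read as "b", "u", "r", "n".
record = CharacterPatternRecord(
    extracted_count=4,
    similarities=[{"b": 0.9}, {"u": 0.8}, {"r": 0.7}, {"n": 0.6}],
    connections=[0, 0, 0, 1],
    valid_min=3,
    valid_max=4,
)
```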

[0021] The number of extracted characters in the character pattern storage record 12 indicates numerically how many character patterns the word to be recognized was cut into. Some extracted character patterns can be recognized as one character on their own, while in other cases several character patterns together are recognized as one character. In other words, the "number of extracted characters" is the number of characters cut out in the character recognition processing. For example, when recognizing the characters "bum", the character pattern data input to the device may be "bum" or "burn". The number of extracted characters in this case is 3 or 4.

[0022] Each character similarity in the character pattern storage record 12 expresses numerically the degree to which a character pattern described above can be recognized as one character. The similarity indicates whether a single character pattern is recognized as one character or whether several adjacent character patterns together are recognized as one character. In this example, a larger character similarity value means a higher similarity and a smaller value means a lower similarity.

[0023] In other words, the "character similarity" expresses numerically which character each extracted character resembles. For example, the likelihood that the character "a" is the first character is expressed as a number. The likelihood that it is the second character is likewise expressed as a number, and so on up to the n-th character. This "n" varies with the value of the number of valid word characters.
[0024] Each character connection count in the character pattern storage record 12 is a characteristic element of the present invention. It indicates numerically within how many connections a character pattern described above is recognized as one character.

[0025] Conventionally, the cumulative similarity was calculated a predetermined number of times, and the number of connected characters of the character pattern giving the highest similarity was selected from the results. In contrast, each character connection count in the character pattern storage record 12 is attached to an individual character pattern of the word to be recognized, so the maximum similarity can be found by performing only as many calculations as the character connection count.

[0026] In other words, the "character connection count" is a number indicating how many adjacent characters each extracted character may be connected with. For example, for the characters "burn", looking from the last character backward, the fourth character "n" may connect with the third character "r" to form "m". Next, "r" cannot connect with "u", and "u" cannot connect with "b". Therefore, for "burn", the character connection counts of "b", "u", and "r" are 0 and that of "n" is 1.
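The "burn" example can be traced with a short Python sketch that enumerates the groupings the connection counts permit. It assumes one particular reading of the text: the connection count of a pattern is the maximum number of immediately preceding patterns it may absorb. The function name `segmentations` is hypothetical.

```python
def segmentations(connections):
    """Yield every grouping of pattern indices into characters.

    connections[i] is read as the maximum number of preceding patterns
    that pattern i may be joined with (an assumption of this sketch).
    """
    def build(end):
        # Group the patterns with indices 0 .. end-1, working from the back.
        if end == 0:
            yield []
            return
        last = end - 1
        for absorbed in range(connections[last] + 1):
            start = last - absorbed
            if start < 0:
                break
            for prefix in build(start):
                yield prefix + [tuple(range(start, end))]
    yield from build(len(connections))

patterns = ["b", "u", "r", "n"]
connections = [0, 0, 0, 1]
for grouping in segmentations(connections):
    print(["".join(patterns[i] for i in group) for group in grouping])
```

With these counts only two groupings are produced: four separate characters, or three characters with "r" and "n" joined. Those are the 4-character and 3-character hypotheses the text describes, so no other combinations need to be scored.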

[0027] The number of valid word characters in the character pattern storage record 12 is a characteristic element of the present invention. It indicates numerically the valid range of character counts, that is, how many characters the word to be recognized may consist of. It gives the character count of the word to be recognized as a minimum and a maximum. For example, if the minimum is 3 and the maximum is 6, the word to be recognized may be a word of three to six characters.

[0028] In other words, the "number of valid word characters" is data needed during character recognition processing: a character count, set as a range from a minimum to a maximum, indicating how many characters the word is likely to have. When the device calculates which of the words registered in the word dictionary the input resembles, the character pattern data as first input does not reveal how many characters the word has, so the number of valid word characters expresses the possibilities.

[0029] The word dictionary storage unit 13 stores, as information on each word prepared in advance, a dictionary word number, word character information, and a word character count as a word storage record 14. The word dictionary storage unit 13 holds as many word storage records 14 as there are words.

[0030] The dictionary word number of the word storage record 14 is a number assigned sequentially to each word in advance. The word character information of the word storage record 14 indicates which characters make up the word. For example, it records which letter of the alphabet is the first character and which is the second.

[0031] The word character count of the word storage record 14 indicates numerically how many characters each word consists of.
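The three fields of a word storage record described in the preceding paragraphs can be pictured as follows. This is only a sketch: `WordRecord` and `build_dictionary` are hypothetical names, and numbering from 1 is an assumption, since the source says only that the numbers are assigned sequentially in advance.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WordRecord:
    """Hypothetical model of one word storage record."""
    number: int       # dictionary word number, assigned sequentially in advance
    characters: str   # word character information (which character is at each position)
    length: int       # word character count, stored alongside the characters

def build_dictionary(words):
    # Assign sequential numbers and precompute each word's character count.
    return [WordRecord(n, w, len(w)) for n, w in enumerate(words, start=1)]

dictionary = build_dictionary(["bum", "burn", "elephant"])
print(dictionary[1])
```

Keeping the character count as its own field mirrors the text, where the count travels to the comparison unit separately from the character information.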

[0032] The character count comparison unit 15 has no counterpart among the conventional components. It judges whether the word is a calculation target by comparing the number of valid word characters of the character pattern storage record 12, sent over signal line 101, with the word character count of the word storage record 14, sent over signal line 102. If the word is not a calculation target, it outputs control signal 103 to the address control unit 10 so that the next word in the word dictionary storage unit 13 is read.

[0033] For example, if the number of valid word characters of the character pattern storage record 12 is three to six and the word character count of the currently read word storage record 14 is eight, the unit instructs the address control unit 10 to read the next word from the word dictionary storage unit 13. When the next word character count has been read from the word dictionary storage unit 13, the unit again judges whether the word is a calculation target, and it repeats this until a calculation-target word is read or all words in the word dictionary storage unit 13 have been compared.
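The example in this paragraph (a valid range of three to six characters and an eight-character dictionary word) can be written as a small Python sketch. The function names are hypothetical, and treating both ends of the range as inclusive is an assumption consistent with the "three to six characters" wording.

```python
def in_valid_range(word_length, valid_min, valid_max):
    # Inclusive range check; inclusiveness is an assumption of this sketch.
    return valid_min <= word_length <= valid_max

def calculation_targets(words, valid_min, valid_max):
    """Yield (dictionary word number, word) for words that pass the count check."""
    for number, word in enumerate(words, start=1):
        if not in_valid_range(len(word), valid_min, valid_max):
            # Not a calculation target: move straight on to the next word.
            continue
        yield number, word

words = ["burn", "elephant", "bum", "to"]
print(list(calculation_targets(words, 3, 6)))
```

Here "elephant" (eight characters) and "to" (two characters) are skipped without any similarity calculation, and the loop ends once every dictionary word has been compared.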

[0034] The cumulative similarity calculation unit 16 receives, via control signal 104, the calculation instruction and the word character count from the character count comparison unit 15. It then calculates the cumulative similarity in the well-known manner from the number of extracted characters, each character similarity, and each character connection count of the character pattern storage record 12, which are read out by the address control unit 10 and sent over signal line 105, and from the dictionary word number and word character information of the word storage record 14, which are sent over signal line 106.

【００３５】この累積類似度を計算する場合、「文字類
似度」を「文字接続数」及び「単語辞書」にしたがって
加算していく。例えば、「ｂｕｍ」という文字を認識す
る場合、文字データパターンが「ｂｕｍ」や「ｂｕｒ
ｎ」である可能性があるので、３文字か４文字であるこ
とを仮定して計算する。「ｂ」の１文字目としての文字
類似度＋「ｕ」の２文字目としての文字類似度＋「ｍ」
の３文字目としての文字類似度の累積類似度と、「ｂ」
の１文字目としての文字類似度＋「ｕ」の２文字目とし
ての文字類似度＋「ｒ」の３文字目としての文字類似度
＋「ｎ」の４文字目としての文字類似度の累積類似度と
を比較し、累積類似度の大きい方（最も似ている方）を
候補とする。When calculating the cumulative similarity, the "character similarity" values are added up according to the "character connection number" and the "word dictionary". For example, when recognizing the characters "bum", the character data pattern may be either "bum" or "burn", so the calculation is performed assuming both three and four characters. The cumulative similarity obtained as (character similarity of "b" as the first character) + (character similarity of "u" as the second character) + (character similarity of "m" as the third character) is compared with the cumulative similarity obtained as (character similarity of "b" as the first character) + (character similarity of "u" as the second character) + (character similarity of "r" as the third character) + (character similarity of "n" as the fourth character), and the one with the larger cumulative similarity (the more similar one) is taken as the candidate.
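
As a worked illustration of the "bum" versus "burn" comparison, the following Python sketch sums per-position character similarities and picks the larger total. The similarity values are invented for the example and do not come from the patent; the names `cumulative_similarity`, `three_char_scores`, and `four_char_scores` are likewise hypothetical.

```python
from typing import Dict, List


def cumulative_similarity(word: str, position_scores: List[Dict[str, int]]) -> int:
    """Add up the similarity of each dictionary character at its position."""
    return sum(position_scores[i][ch] for i, ch in enumerate(word))


# Hypothetical character similarities for the two segmentations of the pattern:
# three cut-out characters ("b", "u", "m") or four ("b", "u", "r", "n").
three_char_scores = [{"b": 90}, {"u": 85}, {"m": 60}]
four_char_scores = [{"b": 90}, {"u": 85}, {"r": 70}, {"n": 75}]

hypotheses = {
    "bum": cumulative_similarity("bum", three_char_scores),
    "burn": cumulative_similarity("burn", four_char_scores),
}

# The hypothesis with the larger cumulative similarity becomes the candidate.
best_word = max(hypotheses, key=hypotheses.get)
print(hypotheses, best_word)
```

With these made-up values the four-character reading wins; different similarity values would of course change the outcome.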

【００３６】計算結果比較部１７は、制御線１０７から
送られる累積類似度計算部１６で計算された計算結果と
制御線１０８から送られる単語候補記憶部１８に記憶さ
れている計算結果とを比較し、予め定められた数のみ制
御線１０９より単語候補記憶部１８に記憶する。例え
ば、予め１０個の単語候補を記憶するのであれば、単語
候補記憶部１８には、計算結果として累積類似度がもっ
とも大きいものから１０個の累積類似度と辞書単語番号
が記憶される。The calculation result comparison unit 17 compares the result calculated by the cumulative similarity calculation unit 16, sent over control line 107, with the calculation results stored in the word candidate storage unit 18, sent over control line 108, and stores only a predetermined number of results in the word candidate storage unit 18 via control line 109. For example, if ten word candidates are to be stored, the word candidate storage unit 18 holds the ten largest cumulative similarities together with their dictionary word numbers.

【００３７】単語候補記憶部１８は、累積類似度計算の
計算結果と辞書単語番号を予め定められた数分、記憶す
る部分である。The word candidate storage unit 18 stores a predetermined number of cumulative similarity calculation results and dictionary word numbers.

【００３８】結果順位付け部１９は、単語候補記憶部１
８に記憶された単語候補を制御線１１０により読取り、
累積類似度の大きいものから順に順位付けするものであ
る。この順位付け結果は制御線１１１により単語候補記
憶部１８に送られ、単語候補情報の記憶をやり直すこと
になる。The result ranking unit 19 reads the word candidates stored in the word candidate storage unit 18 via control line 110 and ranks them in descending order of cumulative similarity. The ranking result is sent to the word candidate storage unit 18 via control line 111, and the word candidate information is stored again.

【００３９】[0039]

【実施例】以下、本発明のより詳細な実施例について図
１を参照して説明する。DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS A more detailed embodiment of the present invention will be described below with reference to FIG. 1.

【００４０】ここでは、説明を簡単にするために信号線
１０１によって送られた文字パターン記憶単位１２の有
効単語文字数が３文字から６文字であるものとする。ま
た、信号線１０２によって送られた単語記憶単位１４の
単語文字数が８文字とする。すると現在読み込まれた辞
書単語は計算対象文字数でないため、制御信号１０３に
より、アドレス制御部１０に次単語の読出しを指示す
る。それにより、新たに次の単語の情報が単語辞書記憶
部１３から信号線１０２によって文字比較部６に単語記
憶単位１４の単語文字数が送られ、同時に単語辞書記憶
部１３から信号線１０６によって累積類似度計算部１６
に単語辞書記憶部１３の単語記憶単位１４の辞書単語番
号、単語文字情報が送られる。そして、文字数比較部１
５は、再度、計算対象の単語であるかどうかの比較判断
し、計算対象の単語数である３文字から６文字の単語数
の単語が読出されるか単語辞書記憶部１３の全ての単語
と比較されるまで繰返す。Here, for simplicity, assume that the number of valid word characters of the character pattern storage unit 12 sent over signal line 101 is three to six, and that the number of word characters of the word storage unit 14 sent over signal line 102 is eight. Because the character count of the currently read dictionary word is not within the calculation target range, control signal 103 instructs the address control unit 10 to read the next word. As a result, the word character count of the word storage unit 14 for the next word is sent from the word dictionary storage unit 13 to the character comparison unit 6 over signal line 102, and at the same time the dictionary word number and word character information of the word storage unit 14 are sent from the word dictionary storage unit 13 to the cumulative similarity calculation unit 16 over signal line 106. The character number comparison unit 15 then judges again whether the word is a calculation target, and this is repeated until a word of three to six characters is read or until all words in the word dictionary storage unit 13 have been compared.

【００４１】もし、単語文字数が３文字から６文字であ
れば、累積類似度計算部１６に制御信号１０４によっ
て、計算処理の指示と単語文字数を送る。If the number of word characters is three to six, the calculation instruction and the number of word characters are sent to the cumulative similarity calculation unit 16 by control signal 104.

【００４２】累積類似度計算部１６は、制御信号１０４
によって送られる、計算処理の指示と文字比較部６から
の単語文字数を受取り、アドレス制御部１０により、読
出された文字パターン記憶単位１２の切出し文字数と、
各文字類似度と、各文字接続数が信号線１０５によって
送られ、単語記憶単位１４の辞書番号と、単語文字情報
が信号信号１０６によって送られたデータを元に累積類
似度を計算する。累積類似度計算部１６は、信号線１０
５によって送られた情報と信号線１０６によって送られ
た情報により累積類似度を周知のようにして算出する回
路である。この算出された累積類似度は、信号線１０６
によって送られた単語記憶単位１４の辞書番号と共に信
号線１０７を介して計算結果比較部１７に送られる。The cumulative similarity calculation unit 16 receives the calculation instruction and the number of word characters from the character comparison unit 6, sent by control signal 104. It calculates the cumulative similarity from the number of cut-out characters, each character similarity, and each character connection number of the character pattern storage unit 12, read by the address control unit 10 and sent over signal line 105, and from the dictionary number and word character information of the word storage unit 14, sent over signal line 106. The cumulative similarity calculation unit 16 is a circuit that calculates the cumulative similarity in a well-known manner from the information sent over signal line 105 and the information sent over signal line 106. The calculated cumulative similarity is sent to the calculation result comparison unit 17 via signal line 107, together with the dictionary number of the word storage unit 14 received over signal line 106.

【００４３】計算結果比較部１７は、信号線１０７から
送られた累積類似度と制御線１０８を介して読出した単
語候補記憶部１８に記憶されている計算結果を比較す
る。単語候補記憶部１８の計算結果は、１０個の計算結
果が記憶される。計算結果比較部１７は、毎単語ごとに
単語候補記憶部１８の計算結果の中から最小の計算結果
（最も似ていないもの）を検索し、その検索された値と
信号線１０７から送られる計算結果の累積類似度と、辞
書単語番号を置き換え、制御線信号線１０９を介して記
憶する。The calculation result comparison unit 17 compares the cumulative similarity sent over signal line 107 with the calculation results stored in the word candidate storage unit 18, read via control line 108. The word candidate storage unit 18 holds ten calculation results. For each word, the calculation result comparison unit 17 searches the results in the word candidate storage unit 18 for the smallest one (the least similar one), replaces that value with the cumulative similarity and dictionary word number sent over signal line 107, and stores it via control line 109.
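
The replace-the-minimum update can be sketched in Python as below. This is a minimal sketch under stated assumptions: the store is modeled as a list of (cumulative similarity, dictionary word number) pairs, the slots start at a low initial value of -9999 (the example value the description gives for the initial state), and a new result replaces the smallest entry only when it is larger. That last condition is an interpretation of "compares", not something the text spells out. All function names are hypothetical.

```python
from typing import List, Optional, Tuple

INITIAL_VALUE = -9999  # lowest value used to pre-fill empty candidate slots

Candidate = Tuple[int, Optional[int]]  # (cumulative similarity, dictionary word number)


def new_candidate_store(size: int = 10) -> List[Candidate]:
    """Create a store whose slots all hold the initial lowest value."""
    return [(INITIAL_VALUE, None)] * size


def update_candidates(store: List[Candidate], score: int, word_number: int) -> bool:
    """Replace the smallest stored result if the new score is larger.

    Returns True when the new result was stored as a word candidate.
    """
    worst = min(range(len(store)), key=lambda i: store[i][0])
    if score > store[worst][0]:
        store[worst] = (score, word_number)
        return True
    return False


store = new_candidate_store(3)
for number, score in enumerate([100, 4000, 800, -20, 6000]):
    update_candidates(store, score, number)
print(store)
```

A three-slot store is used here only to keep the printed result short; the embodiment keeps ten candidates.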

【００４４】結果順位付け部１９は、単語候補記憶部１
８に記憶された１０個の単語候補を制御線１１０により
読取り、累積類似度の大きいものから順に順位付けし、
制御線１１１により単語候補記憶部１８に単語候補情報
の記憶をやり直す部分である。単語候補記憶部１８に累
積類似度がもっとも大きいものから順に１０個の累積類
似度と辞書単語番号が記憶される。The result ranking unit 19 reads the ten word candidates stored in the word candidate storage unit 18 via control line 110, ranks them in descending order of cumulative similarity, and stores the word candidate information again in the word candidate storage unit 18 via control line 111. The word candidate storage unit 18 then holds ten cumulative similarities and dictionary word numbers in descending order of cumulative similarity.

【００４５】この結果順位付け部１９による順位付け
は、単語辞書内のすべての単語照合が終了した後に行わ
れる。本例では、単語候補記憶部１８に最終的に１０候
補の単語が残ることになるが、最初は単語候補記憶部１
８に何も記憶されていない。つまり最初は最低値が初期
値として単語候補記憶部１８に記憶されている。計算結
果比較部１７によって「最小の計算結果」を検索して値
を置き換えた後、結果順位付け部１９で最後に順位付け
するのである。The ranking by the result ranking unit 19 is performed after all the words in the word dictionary have been collated. In this example, ten candidate words eventually remain in the word candidate storage unit 18, but at first nothing is stored there; that is, the lowest value is initially stored in the word candidate storage unit 18 as the initial value. After the calculation result comparison unit 17 has searched for the "minimum calculation result" and replaced the value, the result ranking unit 19 performs the ranking at the end.

【００４６】例えば、最低値を「−９９９９」とした場
合、初期値は、候補０：−９９９９候補１：−９９９９候補２：−９９９９候補３：−９９９９候補４：−９９９９候補５：−９９９９候補６：−９９９９候補７：−９９９９候補８：−９９９９候補９：−９９９９である。そして、累積類似度計算部１６による計算の結
果が、候補０：＋１００候補１：＋４０００候補２：＋８００候補３：−２０候補４：＋６０００候補５：＋２００候補６：−５０候補７：＋５５００候補８：＋３８００候補９：＋４５００であるものとする。For example, when the lowest value is "-9999", the initial values are: Candidate 0: -9999, Candidate 1: -9999, Candidate 2: -9999, Candidate 3: -9999, Candidate 4: -9999, Candidate 5: -9999, Candidate 6: -9999, Candidate 7: -9999, Candidate 8: -9999, Candidate 9: -9999. Suppose that the results of the calculation by the cumulative similarity calculation unit 16 are: Candidate 0: +100, Candidate 1: +4000, Candidate 2: +800, Candidate 3: -20, Candidate 4: +6000, Candidate 5: +200, Candidate 6: -50, Candidate 7: +5500, Candidate 8: +3800, Candidate 9: +4500.

【００４７】この計算結果について結果順位付け部１９
による順位付けを行うと、候補０：＋６０００候補１：＋５５００候補２：＋４５００候補３：＋４０００候補４：＋３８００候補５：＋８００候補６：＋２００候補７：＋１００候補８：−２０候補９：−５０となる。When the result ranking unit 19 ranks these calculation results, they become: Candidate 0: +6000, Candidate 1: +5500, Candidate 2: +4500, Candidate 3: +4000, Candidate 4: +3800, Candidate 5: +800, Candidate 6: +200, Candidate 7: +100, Candidate 8: -20, Candidate 9: -50.
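
The ranking in this example can be reproduced with a short Python sketch. The ten cumulative similarity values are the ones given in the text; pairing each value with its original candidate slot number, and the function name `rank_candidates`, are assumptions added for illustration.

```python
from typing import List, Tuple


def rank_candidates(candidates: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Order (cumulative similarity, slot number) pairs from largest to smallest."""
    return sorted(candidates, key=lambda c: c[0], reverse=True)


# Cumulative similarities from the example, in slot order 0 to 9.
scores = [100, 4000, 800, -20, 6000, 200, -50, 5500, 3800, 4500]
stored = [(score, slot) for slot, score in enumerate(scores)]

ranked = rank_candidates(stored)
ranked_scores = [score for score, _ in ranked]
print(ranked_scores)
```

Sorting from the smallest value instead, a variation the description also allows, would only require dropping `reverse=True`.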

【００４８】本発明の一実施例を上記に説明したが、本
発明においては上記の実施例のみに限らず、各種の付加
変更が可能である。例えば、文字パターン記憶単位１２
の有効単語文字数を３文字から６文字としたり、単語候
補記憶部１８に記憶される単語の候補数は１０個とした
が、その数を増減することも可能である。また、これら
は、固定値である必要はなく、例えば、レジスタやメモ
リ等の記憶手段を他に付加し、数値を変化させることも
可能である。また、結果順位付け部１９では、最大値の
累積類似度から順番に並べ換えているが、最小値から並
べ換えることも可能である。その他、本発明が上記実施
例に限定されず、本発明の技術思想の範囲内において、
各実施例は適宜変更され得ることは明らかである。Although one embodiment of the present invention has been described above, the present invention is not limited to that embodiment, and various additions and changes are possible. For example, the number of valid word characters of the character pattern storage unit 12 was set to three to six and the number of word candidates stored in the word candidate storage unit 18 was set to ten, but these numbers can be increased or decreased. They also need not be fixed values; for example, storage means such as a register or memory may be added so that the values can be changed. Further, the result ranking unit 19 sorts in order from the largest cumulative similarity, but it may also sort from the smallest. In addition, the present invention is not limited to the above embodiment, and it is clear that each embodiment may be modified as appropriate within the scope of the technical idea of the present invention.

【００４９】ところで、本文字認識装置においては、単
語が記憶されている単語辞書と、切出された文字パター
ンデータの切出し文字数、各文字類似度、各文字接続
数、有効単語文字数を記憶する文字パターン記憶部とを
含み、１単語データの各１文字分が細分化された切出し
文字パターンデータを１単語分として認識する文字認識
装置における文字認識方法が実現されている。この文字
認識方法について図２を参照して説明する。This character recognition device realizes a character recognition method for a device that includes a word dictionary in which words are stored and a character pattern storage unit that stores the number of cut-out characters, each character similarity, each character connection number, and the number of valid word characters of the cut-out character pattern data, and that recognizes, as one word, cut-out character pattern data in which each character of one word of data is subdivided. This character recognition method will be described with reference to FIG. 2.

【００５０】同図において、まず最初に、文字パターン
記憶部に記憶されている有効単語文字数と単語辞書より
順次読出された単語文字数との比較結果が文字数範囲内
でない場合は、前記単語辞書に記憶されている次単語を
読出す（ステップＳ２１）。次に、文字パターン記憶部
１２に記憶されている各文字パターンデータの各文字接
続数及び文字パターンの各文字に対する類似度である各
文字類似度並びにステップＳ２１の比較結果に基づい
て、切出された文字パターンデータがどれだけ単語辞書
の各単語と類似しているか累積類似度計算する（ステッ
プＳ２２）。さらに、この計算された累積類似度と単語
候補記憶部１８の記憶内容とを比較して単語候補として
記憶するかどうか判断する（ステップＳ２３）。この場
合、前回計算した累積類似度が単語候補記憶部１８に記
憶されていれば、その記憶されている累積類似度と比較
される。ただし、累積類似度が単語候補記憶部１８に記
憶されておらず、上述した「−９９９９」等の初期値が
記憶されている場合は、その初期値と比較される。In the figure, first, if the result of comparing the number of valid word characters stored in the character pattern storage unit with the number of word characters sequentially read from the word dictionary is not within the character number range, the next word stored in the word dictionary is read (step S21). Next, the cumulative similarity, which indicates how similar the cut-out character pattern data is to each word in the word dictionary, is calculated based on each character connection number of each character pattern data stored in the character pattern storage unit 12, each character similarity (the similarity of the character pattern to each character), and the comparison result of step S21 (step S22). The calculated cumulative similarity is then compared with the contents of the word candidate storage unit 18 to judge whether it should be stored as a word candidate (step S23). If a previously calculated cumulative similarity is stored in the word candidate storage unit 18, the comparison is made with that stored value. If no cumulative similarity is stored and an initial value such as the "-9999" mentioned above is stored instead, the comparison is made with that initial value.

【００５１】そして、ステップＳ２３で判断された累積
類似度を記憶する（ステップＳ２４）。最後に、この記
憶された累積類似度を順に並べ換えるのである（ステッ
プＳ２５）。Then, the cumulative similarity determined in step S23 is stored (step S24). Finally, the stored cumulative similarities are rearranged in order (step S25).

【００５２】ここで、ステップＳ２２においては、文字
パターン記憶部１２に記憶されている１文字が細分化さ
れた各文字パターンデータの文字接続数に従い隣り合っ
た各文字データを１文字として接続し、文字パターン記
憶部１２に記憶されている各文字類似度からどれだけ単
語辞書の各単語と類似しているか計算処理しているので
ある。Here, in step S22, adjacent character data are connected into one character according to the character connection number of each character pattern data, in which one character stored in the character pattern storage unit 12 has been subdivided, and the degree of similarity to each word in the word dictionary is calculated from each character similarity stored in the character pattern storage unit 12.

【００５３】また、ステップＳ２１においては、文字パ
ターン記憶部に記憶されている有効単語文字数と単語辞
書より順次読出された単語文字数を比較した結果が文字
数範囲内でない場合は、ステップＳ２２の計算処理をせ
ずに単語辞書に記憶されている次単語を読出すように制
御しているのである。In step S21, if the result of comparing the number of valid word characters stored in the character pattern storage unit with the number of word characters sequentially read from the word dictionary is not within the character number range, control is performed so that the next word stored in the word dictionary is read without performing the calculation processing of step S22.

【００５４】さらにまた、ステップＳ２５においては、
ステップＳ２４において記憶された累積類似度について
予め決められた数の最有力候補のみを順に並べ変えてい
るのである。Furthermore, in step S25, only a predetermined number of the most probable candidates among the cumulative similarities stored in step S24 are sorted in order.
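
Steps S21 to S25 can be put together in one short Python sketch. It is a simplified illustration, not the claimed method itself: the scoring function `toy_score` is a stand-in invented for the example (the patent computes the cumulative similarity from character similarities and character connection numbers), the observed string "burn" and the dictionary are hypothetical, and a new result is stored only when it exceeds the smallest stored value.

```python
from typing import Callable, List, Sequence, Tuple

INITIAL_VALUE = -9999  # initial lowest value held by empty candidate slots


def recognize(
    dictionary: Sequence[str],
    valid_range: Tuple[int, int],
    score_word: Callable[[str], int],
    keep: int = 10,
) -> List[Tuple[int, int]]:
    """Return up to `keep` (cumulative similarity, word number) pairs, best first."""
    min_chars, max_chars = valid_range
    store: List[Tuple[int, int]] = [(INITIAL_VALUE, -1)] * keep
    for number, word in enumerate(dictionary):
        # S21: skip words outside the valid character number range.
        if not (min_chars <= len(word) <= max_chars):
            continue
        # S22: calculate the cumulative similarity for this word.
        score = score_word(word)
        # S23: compare with the smallest stored result.
        worst = min(range(keep), key=lambda i: store[i][0])
        if score > store[worst][0]:
            # S24: store the result as a word candidate.
            store[worst] = (score, number)
    # S25: sort the stored candidates, largest cumulative similarity first.
    ranked = sorted(store, key=lambda c: c[0], reverse=True)
    return [c for c in ranked if c[1] != -1]  # drop slots never filled


def toy_score(word: str, observed: str = "burn") -> int:
    """Hypothetical scorer: reward matching positions, penalize length mismatch."""
    matches = sum(a == b for a, b in zip(word, observed))
    return matches * 100 - abs(len(word) - len(observed)) * 10


words = ["bum", "burn", "keyboard", "barn", "at", "turn"]
result = recognize(words, (3, 6), toy_score, keep=2)
print(result)
```

Because the length check runs before scoring, `toy_score` is never called for "keyboard" or "at", which is where the time saving described in the text comes from.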

【００５５】なお、以上説明した図２の処理を実現する
ためのプログラムを記録した記録媒体を用意し、これを
用いて図１の各部を制御すれば、上述と同様の動作を行
うことができることは明白である。この記録媒体には、
図１中に示されていない半導体メモリ、磁気ディスク装
置の他、種々の記録媒体を用いることができる。Note that if a recording medium on which a program for realizing the processing of FIG. 2 described above is recorded is prepared and used to control each part of FIG. 1, it is clear that the same operation as described above can be performed. As this recording medium, a semiconductor memory or a magnetic disk device, neither of which is shown in FIG. 1, as well as various other recording media, can be used.

【００５６】また、同記録媒体に記録されているプログ
ラムによってコンピュータを制御すれば、上述と同様に
文字認識動作を行うことができることは明白である。こ
の記録媒体には、半導体メモリ、磁気ディスク装置の
他、種々の記録媒体を用いることができる。It is also clear that if a computer is controlled by the program recorded on this recording medium, the character recognition operation can be performed in the same manner as described above. As this recording medium, a semiconductor memory, a magnetic disk device, and various other recording media can be used.

【００５７】[0057]

【発明の効果】以上説明したように本発明は、文字認識
処理の中で処理時間が多くかかる累積類似度計算をする
前段階で、認識すべき単語の文字パターン情報の文字数
を比較することで処理時間を短縮し、更に、１文字が細
分化された文字パターンデータにおいて、認識すべき単
語の文字パターン情報の各文字接続数を参照すること
で、１文字となる組み合わせ数を導き出す処理時間を短
縮することができるという効果がある。また、多数の単
語を納めた単語辞書より、予め定められた単語数の上位
候補を並び換えてその結果を出力することができるとい
う効果がある。As described above, the present invention shortens processing time by comparing the number of characters in the character pattern information of the word to be recognized at a stage before the cumulative similarity calculation, which is the time-consuming part of character recognition processing. Furthermore, for character pattern data in which one character is subdivided, referring to each character connection number in the character pattern information of the word to be recognized shortens the processing time needed to derive the number of combinations that form one character. In addition, from a word dictionary holding a large number of words, a predetermined number of top candidates can be sorted and output as the result.

[Brief description of the drawings]

【図１】本発明の文字認識装置の実施の一形態を示す構
成図である。FIG. 1 is a configuration diagram showing one embodiment of a character recognition device of the present invention.

【図２】図１の装置によって実現される文字認識方法を
示すフローチャートである。FIG. 2 is a flowchart illustrating a character recognition method realized by the apparatus of FIG. 1;

【図３】従来の文字認識装置の一例を示す構成図であ
る。FIG. 3 is a configuration diagram illustrating an example of a conventional character recognition device.

[Explanation of symbols]

１０アドレス制御部１１文字パターン記憶部１２文字パターン記憶単位１３単語辞書記憶部１４単語記憶単位１５文字数比較部１６累積類似度計算部１７計算結果比較部１８単語候補記憶部１９結果順位付け部 Reference Signs List: 10 address control unit; 11 character pattern storage unit; 12 character pattern storage unit; 13 word dictionary storage unit; 14 word storage unit; 15 character number comparison unit; 16 cumulative similarity calculation unit; 17 calculation result comparison unit; 18 word candidate storage unit; 19 result ranking unit

Claims

[Claims]

1. A character recognition device for recognizing, as one word, cut-out character pattern data in which each character of one word of data is subdivided, comprising: a word dictionary in which words are stored; character pattern storage means for storing the number of cut-out characters, each character similarity, each character connection number, and the number of valid word characters of the cut-out character pattern data; address control means for sequentially reading information from the word dictionary and the character pattern storage means; character number comparing means for controlling the address control means so that the next word stored in the word dictionary is read if the result of comparing the number of valid word characters stored in the character pattern storage means with the number of word characters sequentially read from the word dictionary is not within the character number range; cumulative similarity calculating means for calculating a cumulative similarity indicating how similar the cut-out character pattern data is to each word in the word dictionary, based on each character connection number of each character pattern data stored in the character pattern storage means, each character similarity, which is the similarity of the character pattern to each character, and the comparison result of the character number comparing means; word candidate storage means for storing the cumulative similarity; calculation result comparing means for comparing the cumulative similarity calculated by the cumulative similarity calculating means with the cumulative similarity stored in the word candidate storage means and, if it qualifies as a word candidate, storing it in the word candidate storage means; and ranking means for sequentially rearranging the cumulative similarities stored in the word candidate storage means.

2. The character recognition device according to claim 1, wherein the cumulative similarity calculating means connects adjacent character data into one character according to the character connection number of each character pattern data, in which one character stored in the character pattern storage means is subdivided, and calculates how similar the data is to each word in the word dictionary from each character similarity stored in the character pattern storage means.

3. The character recognition device according to claim 1, wherein, if the result of comparing the number of valid word characters stored in the character pattern storage means with the number of word characters sequentially read from the word dictionary is not within the character number range, the character number comparing means controls the address control means so that the next word stored in the word dictionary is read without performing the calculation processing of the cumulative similarity calculating means.

4. The character recognition device according to claim 1, wherein the ranking unit sequentially rearranges only a predetermined number of the most probable candidates for the cumulative similarity as a result of the word candidate storage unit. apparatus.

5. A character recognition method for a character recognition device that includes a word dictionary in which words are stored and a character pattern storage unit that stores the number of cut-out characters, each character similarity, each character connection number, and the number of valid word characters of cut-out character pattern data, and that recognizes, as one word, cut-out character pattern data in which each character of one word of data is subdivided, the method comprising: a character number comparing step of reading the next word stored in the word dictionary if the result of comparing the number of valid word characters stored in the character pattern storage unit with the number of word characters sequentially read from the word dictionary is not within the character number range; a cumulative similarity calculating step of calculating a cumulative similarity indicating how similar the cut-out character pattern data is to each word in the word dictionary, based on each character connection number of each character pattern data stored in the character pattern storage unit, each character similarity, which is the similarity of the character pattern to each character, and the comparison result of the character number comparing step; a calculation result comparing step of comparing the calculated cumulative similarity with the cumulative similarity stored in a word candidate storage unit and, if it qualifies as a word candidate, storing it in the word candidate storage unit; and a ranking step of sequentially rearranging the cumulative similarities stored in the word candidate storage unit.

6. The character recognition method according to claim 5, wherein, in the cumulative similarity calculating step, adjacent character data are connected into one character according to the character connection number of each character pattern data, in which one character stored in the character pattern storage unit is subdivided, and how similar the data is to each word in the word dictionary is calculated from each character similarity stored in the character pattern storage unit.

7. The character recognition method according to claim 5, wherein, in the character number comparing step, if the result of comparing the number of valid word characters stored in the character pattern storage unit with the number of word characters sequentially read from the word dictionary is not within the character number range, control is performed so that the next word stored in the word dictionary is read without performing the calculation processing of the cumulative similarity calculating step.

8. The method according to claim 5, wherein in the ranking step, only a predetermined number of the most probable candidates for the cumulative similarity stored in the word candidate storage step are rearranged in order. The character recognition method described in any of the above.

9. A recording medium storing a character recognition program for controlling a character recognition device that includes a word dictionary in which words are stored and a character pattern storage unit that stores the number of cut-out characters, each character similarity, each character connection number, and the number of valid word characters of cut-out character pattern data, and that recognizes, as one word, cut-out character pattern data in which each character of one word of data is subdivided, the character recognition program comprising: a character number comparing step of reading the next word stored in the word dictionary if the result of comparing the number of valid word characters stored in the character pattern storage unit with the number of word characters sequentially read from the word dictionary is not within the character number range; a cumulative similarity calculating step of calculating a cumulative similarity indicating how similar the cut-out character pattern data is to each word in the word dictionary, based on each character connection number of each character pattern data stored in the character pattern storage unit, each character similarity, which is the similarity of the character pattern to each character, and the comparison result of the character number comparing step; a calculation result comparing step of comparing the calculated cumulative similarity with the cumulative similarity stored in a word candidate storage unit and, if it qualifies as a word candidate, storing it in the word candidate storage unit; and a ranking step of sequentially rearranging the cumulative similarities stored in the word candidate storage unit.

10. The recording medium according to claim 9, wherein, in the cumulative similarity calculating step, adjacent character data are connected into one character according to the character connection number of each character pattern data, in which one character stored in the character pattern storage unit is subdivided, and how similar the data is to each word in the word dictionary is calculated from each character similarity stored in the character pattern storage unit.

11. The recording medium according to claim 9, wherein, in the character number comparing step, if the result of comparing the number of valid word characters stored in the character pattern storage unit with the number of word characters sequentially read from the word dictionary is not within the character number range, control is performed so that the next word stored in the word dictionary is read without performing the calculation processing of the cumulative similarity calculating step.

12. The method according to claim 9, wherein in the ranking step, only a predetermined number of the most probable candidates for the cumulative similarity stored in the word candidate storage step are rearranged in order. The recording medium according to any one of the above.