JPH07105321A

JPH07105321A - Word recognition device, address recognition device and word recognition method

Info

Publication number: JPH07105321A
Application number: JP5252925A
Authority: JP
Inventors: Kazuhiro Hasegawa; 和宏長谷川
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1993-10-08
Filing date: 1993-10-08
Publication date: 1995-04-21

Abstract

PURPOSE: To provide a word recognition device that recognizes words quickly. CONSTITUTION: An address recognition device 9 is made up of a word recognition device 17, an address recognition circuit 19, and an address dictionary 21. The word recognition device 17 includes a word recognition circuit 39, which refers to a word dictionary 41 to recognize words and list word candidates. Word candidates are listed by comparing the character candidates output from a character recognition circuit 35 with the words in the word dictionary 41. The word recognition circuit 39 uses a character threshold set for each character in the word dictionary 41 as one criterion; it also counts non-matching characters character by character and compares the count with the maximum number of non-matching characters set in the word dictionary.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a word recognition device, an address recognition device, and a word recognition method.

[0002]

[Prior Art] A conventional word recognition device performs character recognition, lists a plurality of character candidates as the result together with character evaluation values (values determined by how likely each candidate is), and recognizes a word from among these candidates by referring to a word dictionary.

[0003] In the word dictionary described above, a plurality of words are registered, each with a corresponding maximum number of non-matching characters and a threshold value. The maximum number of non-matching characters is the upper limit on the number of characters counted when none of the character candidates produced by character recognition can be used as a character of the word; a word that exceeds this value does not become a word candidate. The threshold value is the criterion for judging a word during word recognition; a word whose constituent characters have character evaluation values that sum to less than this threshold does not become a word candidate.

[0004] In such a word recognition device, the check against the maximum number of non-matching characters in the word dictionary is made only after all the characters of each word have been processed, so the comparison process must be performed at least as many times as there are characters in the word. For example, if the word dictionary specifies a maximum number of non-matching characters of 0 and no character candidate matches the first character of the character string, comparison of the second and subsequent characters is unnecessary, yet it was performed anyway.

[0005] Furthermore, because the word threshold is compared with the sum of the evaluation values of the characters making up the word, the word is evaluated uniformly even when the evaluation values of the individual character candidates vary. As a result, the reliability of the word candidates was low.
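The conventional procedure described above can be sketched roughly as follows. This is only an illustrative reading of the text, not the device's actual logic: the function name `conventional_match`, the per-position dictionaries mapping candidate characters to scores, and the sample scores are all assumptions made for the example.

```python
def conventional_match(word, max_mismatch, word_threshold, candidates):
    """Conventional check: limits are tested only after all characters are compared.

    candidates[i] maps each candidate character for position i to its score
    (a hypothetical layout chosen for this sketch).
    Returns (accepted, comparisons_made).
    """
    total = 0        # sum of character evaluation values
    mismatches = 0   # characters for which no usable candidate exists
    comparisons = 0  # number of character positions examined
    for position, ch in enumerate(word):
        comparisons += 1
        score = candidates[position].get(ch)
        if score is None:
            mismatches += 1
        else:
            total += score
    # Both limits are applied only once every character has been processed.
    accepted = mismatches <= max_mismatch and total >= word_threshold
    return accepted, comparisons


# Hypothetical scores: no candidate for the first position is "カ".
candidates = [
    {"サ": 80, "ケ": 60},
    {"ワ": 90},
    {"サ": 85},
    {"キ": 70},
    {"シ": 95},
]
result = conventional_match("カワサキシ", 0, 250, candidates)
print(result)
```

With a maximum of 0 non-matching characters the word is already ruled out at the first position, yet all five positions are still examined, which is the inefficiency the text points out.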

[0006]

[Problems to Be Solved by the Invention] As described above, in the conventional word recognition device the check against the maximum number of non-matching characters in the word dictionary is made only after all the characters of each word have been processed, so the comparison process must be performed at least as many times as there are characters in the word. For example, if the word dictionary specifies a maximum number of non-matching characters of 0 and no character candidate matches the first character of the character string, comparison of the second and subsequent characters is unnecessary, yet it was performed anyway.

[0007] Furthermore, because the word threshold is compared with the sum of the evaluation values of the characters making up the word, the word is evaluated uniformly even when the evaluation values of the individual character candidates vary. As a result, the reliability of the word candidates was low. Accordingly, an object of the present invention is to eliminate the above drawbacks and to provide a word recognition device that can perform word recognition processing quickly.

[0008]

[Means for Solving the Problems] To achieve the above object, the first invention is a word recognition device characterized by the following means. Reading means reads a word written on a paper sheet. Character recognition means recognizes, character by character, the word read from the paper sheet by the reading means and outputs, for each character, a plurality of character candidates each assigned an evaluation value. Storage means stores a plurality of words expected to be written on the paper sheet, an allowable limit on the number of non-matching characters for each word, and a threshold set for each character of each word. Comparison means compares each character of a word stored in the storage means with the character candidates output by the character recognition means to determine whether they match. Counting means counts a character as a non-matching character when, during the comparison by the comparison means, the evaluation value of the character candidate output from the character recognition means is smaller than the threshold of the corresponding character stored in the storage means. Determination means determines that the input is not the corresponding word when the resulting count exceeds the allowable limit on the number of non-matching characters stored in the storage means.

[0009] To achieve the above object, the second invention is a word recognition device characterized by the following means. Reading means reads a word written on a paper sheet. Character recognition means recognizes, character by character, the word read from the paper sheet by the reading means and outputs, for each character, a plurality of character candidates each assigned an evaluation value. Storage means stores a plurality of words expected to be written on the paper sheet and a threshold set for each character of each word. Comparison means compares each character of a word stored in the storage means with the character candidates output from the character recognition means to determine whether they match. Determination means determines that a character is a non-matching character when, during the comparison by the comparison means, the evaluation value of the character candidate is smaller than the threshold of the corresponding character stored in the storage means.

[0010] To achieve the above object, the third invention is a word recognition device characterized by the following means. Reading means reads a word written on a paper sheet. Character recognition means recognizes, character by character, the word read from the paper sheet by the reading means and outputs, for each character, a plurality of character candidates each assigned an evaluation value. Storage means stores a plurality of words expected to be written on the paper sheet and an allowable limit on the number of non-matching characters for each word. Determination means compares, in order starting from the first character of a word stored in the storage means, each character with the corresponding plurality of character candidates and determines whether any of the character candidates matches the character of the word. Word recognition means recognizes the candidate as a character of the word when the determination finds a matching character candidate. Counting means counts the character as a non-matching character when the determination finds no matching character candidate. The determination means deems the word unrecognized when the resulting count exceeds the allowable limit on the number of non-matching characters stored in the storage means.

[0011] To achieve the above object, the fourth invention is an address recognition device characterized by the following means. Reading means reads a word written on a paper sheet. Character recognition means recognizes, character by character, the word read from the paper sheet by the reading means and outputs, for each character, a plurality of character candidates each assigned an evaluation value. Storage means stores a plurality of words expected to be written on the paper sheet, an allowable limit on the number of non-matching characters for each word, and a threshold set for each character of each word. Comparison means compares each character of a word stored in the storage means with the character candidates output by the character recognition means to determine whether they match. Address recognition means recognizes an address from the word when the comparison by the comparison means finds that the word and the character candidates match. Counting means counts a character as a non-matching character when, during the comparison by the comparison means, the evaluation value of the character candidate output from the character recognition means is smaller than the threshold of the corresponding character stored in the storage means. Determination means determines that the input is not the corresponding word when the resulting count exceeds the allowable limit on the number of non-matching characters stored in the storage means.

[0012] To achieve the above object, the fifth invention is a word recognition method that recognizes a word from the words stored in a word dictionary and from the per-character candidates obtained by character recognition, characterized by the following steps. In the first step, a word, a threshold set for each character of the word, and an allowable limit on the number of non-matching characters of the word are stored in the word dictionary in advance. In the second step, a word written on a medium is read. In the third step, the word read from the medium in the second step is recognized character by character, and character candidates each assigned an evaluation value are output for each character. In the fourth step, the per-character threshold of a word in the word dictionary stored in the first step is compared with the per-character evaluation value of the character candidate output in the third step. In the fifth step, it is determined whether the character of the word, for which the fourth step found the evaluation value to be at or above the threshold, matches the character candidate. In the sixth step, when the evaluation value is found by the fifth step to be smaller than the threshold, the character of the word is counted as a non-matching character. In the seventh step, it is determined whether the number of non-matching characters counted in the sixth step is larger than the allowable limit on the number of non-matching characters stored in the first step. In the eighth step, when the seventh step finds that the number of non-matching characters is larger than the allowable limit, the input is determined not to be the corresponding word.

[0013]

[Operation] In the word recognition device according to the first invention configured as described above, a character threshold is set for each character in the word dictionary and used as a criterion; in addition, the number of non-matching characters is counted character by character and compared with the maximum number of non-matching characters set in advance in the word dictionary to perform word recognition.

[0014] In the word recognition device according to the second invention configured as described above, a character threshold is set for each character in the word dictionary and used as the criterion. In the word recognition device according to the third invention configured as described above, the count is compared with the maximum number of non-matching characters set in advance in the word dictionary each time a character of the word is recognized.
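The per-character limit check of the third invention can be sketched as follows. This is a minimal sketch under stated assumptions: the name `early_reject_match`, the use of one set of candidate characters per position, and the sample candidates are hypothetical and chosen only to show the early exit.

```python
def early_reject_match(word, max_mismatch, candidates):
    """Compare characters in order and test the mismatch limit after each one.

    candidates[i] is the set of candidate characters for position i
    (a hypothetical layout chosen for this sketch).
    Returns (recognized, comparisons_made).
    """
    mismatches = 0
    comparisons = 0
    for position, ch in enumerate(word):
        comparisons += 1
        if ch not in candidates[position]:
            mismatches += 1
            # The limit is checked immediately, so later characters are skipped.
            if mismatches > max_mismatch:
                return False, comparisons
    return True, comparisons


# Hypothetical candidates: the first position has no "カ".
candidates = [{"サ", "ケ"}, {"ワ"}, {"サ"}, {"キ"}, {"シ"}]
strict = early_reject_match("カワサキシ", 0, candidates)
lenient = early_reject_match("カワサキシ", 1, candidates)
print(strict, lenient)
```

With a limit of 0, the word is rejected after a single comparison instead of after all five.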

[0015] In the address recognition device according to the fourth invention configured as described above, a character threshold is set for each character in the word dictionary and used as a criterion; in addition, the number of non-matching characters is counted character by character and compared with the maximum number of non-matching characters set in advance in the word dictionary to perform word recognition, and address recognition is performed on the basis of these judgments. In the word recognition method according to the fifth invention configured as described above, a given word is rejected as the corresponding word on the basis of the per-character thresholds and the count of non-matching characters.

[0016]

[Embodiment] An embodiment of the present invention is described in detail below with reference to the drawings. First, the configuration of a mail sorter in which the word recognition device of this embodiment can be mounted is described with reference to FIG. 2. FIG. 2 is a front view of the mail sorter of this embodiment.

[0017] The mail sorter 1 has a supply unit 3, from which mail items P such as postcards and sealed letters are fed in a batch, standing upright. The mail items P fed from the supply unit 3 are conveyed to a take-out unit 5, which takes out the mail items P one at a time and conveys them to an intake conveyance path 7. The intake conveyance path 7 consists of a belt and a plurality of rollers and conveys each mail item P while gripping it.

[0018] An address recognition device 9 is installed on the upstream side of the intake conveyance path 7. The address recognition device 9 is a word recognition device with an added address determination function; it reads the address information written on a mail item P and outputs sorting designation data. The address recognition device 9 is described in detail later.

[0019] A mail item P that has passed the address recognition device 9 is then conveyed to a sorting unit 11. The sorting unit 11 consists of a plurality of sorting boxes 13 ... divided, for example, by address or by postal code. Mail items P are sorted and stacked in the sorting boxes 13 ... according to the sorting designation data from the address recognition device 9 described above.

[0020] Mail items P are conveyed to the sorting boxes 13 ... by sorting conveyance paths 15a to 15g. Next, the configuration of the address recognition device 9 of this embodiment is described with reference to FIG. 3. FIG. 3 is a block diagram of the address recognition device of this embodiment.

[0021] The address recognition device 9 consists of a word recognition unit 17, an address recognition circuit 19, and an address dictionary 21. The word recognition unit 17 includes a photoelectric conversion unit 23. The photoelectric conversion unit 23 irradiates the mail item P with light from a light source (not shown), receives the reflected light, and converts it into an electric signal.

[0022] A binarization circuit 25 binarizes, with a predetermined threshold, the read signal of the image of the entire surface of the mail item P output from the photoelectric conversion unit 23. The resulting binarized signal of the full-surface image is output to an address area detection unit 27.
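Binarization with a fixed threshold can be illustrated with a short sketch. The text does not state the signal range or polarity, so the 8-bit grayscale values, the treatment of darker pixels as ink, and the name `binarize` are assumptions made for this example.

```python
def binarize(gray_rows, threshold):
    """Return 1 for pixels darker than the threshold (assumed ink), else 0."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray_rows]


# Hypothetical 2 x 3 grayscale patch (0 = black, 255 = white).
image = [
    [250, 40, 245],
    [30, 35, 240],
]
binary = binarize(image, 128)
print(binary)
```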

[0023] The address area detection unit 27 detects the address area within the image of the mail item P by a projection method or the like, specifying it by coordinates. After detection, it outputs a signal indicating that a given range is the address area.
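As a rough illustration of a projection method, the sketch below sums the ink pixels of a binarized image along rows and columns and reports the coordinate range in which ink appears. It is a simplification under stated assumptions: the text does not describe how the address area is distinguished from other printed regions, and the name `projection_bounds` and the sample image are hypothetical.

```python
def projection_bounds(binary_rows):
    """Locate the inked region from row and column projection profiles.

    Returns (top, bottom, left, right) as inclusive indices, or None if the
    image contains no ink.
    """
    row_profile = [sum(row) for row in binary_rows]        # horizontal projection
    col_profile = [sum(col) for col in zip(*binary_rows)]  # vertical projection
    rows = [i for i, count in enumerate(row_profile) if count > 0]
    cols = [j for j, count in enumerate(col_profile) if count > 0]
    if not rows:
        return None
    return rows[0], rows[-1], cols[0], cols[-1]


# Hypothetical binarized image (1 = ink).
binary = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
bounds = projection_bounds(binary)
print(bounds)
```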

[0024] A selection circuit 29 receives the output signal from the address area detection unit 27 and, of the binarized signal input from the binarization circuit 25, selectively outputs only the binarized signal of the address area.

[0025] A character detection and segmentation circuit 31 performs, for example, line detection and segmentation on the binarized signal of the address area input from the selection circuit 29, then detects and segments the characters one at a time using an estimated pitch, and outputs a signal for each character.

[0026] A normalization circuit 33 normalizes each character in the signal input from the character detection and segmentation circuit 31 to a fixed size. The normalized signal is output to a character recognition circuit 35.

[0027] The character recognition circuit 35 is connected to a character dictionary 37, in which a large number of characters are registered. The character recognition circuit 35 performs character recognition by referring to the character dictionary 37. Character recognition is carried out, for example, by matching against the reference patterns of the characters registered in the character dictionary 37, and produces character candidates. Because character recognition in this embodiment yields the same character candidates as those described with reference to FIG. 2, their description is omitted here. The character recognition circuit 35 outputs the character candidate information to a word recognition circuit 39.

[0028] The word recognition circuit 39 is connected to a word dictionary 41, in which a large number of words are registered. The word recognition circuit 39 performs word recognition by referring to the word dictionary 41 and produces word candidates. The word dictionary 41 and the flow of word recognition are described in detail later. The word recognition circuit 39 outputs the word candidate information to the address recognition circuit 19.

[0029] The address recognition circuit 19 is connected to the address dictionary 21, in which addresses that actually exist are registered. The address recognition circuit 19 refers to the address dictionary 21 to recognize an address from the word candidates and outputs sorting designation data. As described above, the sorting designation data is used for sorting and stacking mail into the sorting boxes 13 ... of the mail sorter 1.

[0030] Next, the character candidates produced by the character recognition circuit 35 of this embodiment are described with reference to FIG. 4. FIG. 4 shows the character candidates of this embodiment, namely those obtained by performing character recognition on five characters. For character number 1, the first-rank candidate is the character "カ" (ka) with a score of 100, the second-rank candidate is "ケ" (ke) with a score of 90, and so on, with a character and a score given for each rank down to the Nth-rank candidate, "サ" (sa), with a score of 20. Characters and scores are given in the same way for character numbers 2, 3, 4, and 5.
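One way to hold the ranked candidates of FIG. 4 in memory is sketched below. The `Candidate` type, the list layout, and the helper `best` are assumptions made for illustration; only the three entries that the text names for character number 1 are included, and the ranks between the second and the Nth are left out because the text does not give them.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    char: str   # candidate character
    score: int  # evaluation value assigned by character recognition


# Candidates for character number 1, best first (intermediate ranks omitted).
position_1 = [Candidate("カ", 100), Candidate("ケ", 90), Candidate("サ", 20)]


def best(candidates):
    """Return the candidate with the highest score."""
    return max(candidates, key=lambda c: c.score)


top = best(position_1)
print(top.char, top.score)
```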

[0031] The relationship between a character and its score is not fixed. For character number 1 the second-rank candidate is "ケ" with a score of 90, whereas for character number 2 the second-rank candidate is also "ケ" but with a score of 95. In other words, the same character can appear as a candidate with different scores depending on the character being recognized.

[0032] Next, the word dictionary 41 of this embodiment is described with reference to FIG. 5. FIG. 5 is an example of the word dictionary used for word recognition in this embodiment. As shown in FIG. 5, the word dictionary 41 registers, for each word number, a character string, character thresholds, a maximum number of non-matching characters, and a threshold. For word number A, the character string "カワサキシ" (Kawasaki-shi) is registered, the character threshold is 50 for each character, the maximum number of non-matching characters is 0, and the threshold is 250. Similarly, for word number B the character string, character threshold, maximum number of non-matching characters, and threshold are "カワサキク" (Kawasaki-ku), 50, 0, and 250, respectively, and for word number C they are "サイワイク" (Saiwai-ku), 0, 1, and 250, respectively.
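The dictionary layout of FIG. 5 might be represented as in the following sketch. The `WordEntry` class, its field names, and the length check are assumptions introduced for illustration; the values for word numbers A and B are taken from the text, and word number C is not included in this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordEntry:
    text: str               # registered character string
    char_thresholds: tuple  # one character threshold per character
    max_mismatch: int       # maximum number of non-matching characters
    word_threshold: int     # threshold for the whole word

    def __post_init__(self):
        # Each character of the word needs exactly one character threshold.
        if len(self.char_thresholds) != len(self.text):
            raise ValueError("one character threshold is required per character")


word_dictionary = {
    "A": WordEntry("カワサキシ", (50,) * 5, 0, 250),
    "B": WordEntry("カワサキク", (50,) * 5, 0, 250),
}

entry = word_dictionary["A"]
print(entry.text, entry.char_thresholds, entry.max_mismatch, entry.word_threshold)
```

Storing the character thresholds as a tuple per word leaves room for thresholds that differ from character to character.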

[0033] A character threshold is set for each character of the character string and is compared with the scores of the character candidates. A character candidate whose score is at or above the character threshold is checked for a match against the word dictionary 41, whereas a character candidate whose score is below the character threshold is not compared with the word dictionary 41.

[0034] Here the character threshold is uniformly set to 50 for all characters, but it may of course take a different value for each character, depending on, for example, how often misrecognition occurs during word recognition.

[0035] Next, the word recognition operation of the word recognition circuit 39 of this embodiment is described with reference to FIGS. 1, 4, and 5. This embodiment is described using the example of recognizing five characters as one word.

[0036] FIG. 1 is a flowchart showing the word recognition method of this embodiment. As a premise, this method processes one word number at a time; it does not process word numbers in parallel.

[0037] First, as initialization, the total character evaluation value (total score) is set to 0 and the number of non-matching characters is set to 0 (step 1). Next, the i-th character of the character string registered in the word dictionary 41 is taken out (step 2). Because characters are taken out in order starting from the first registered character during word recognition, the following description assumes that i is 1; for word number A shown in FIG. 5, "カ" is taken out.

[0038] Next, the j-th rank character candidate is taken out from the character candidates (step 3). That is, the j-th rank candidate for the position corresponding to the character taken out in step 2 is taken out. Because the most probable character must be selected to form the word, candidates are examined in order starting from the first rank, and the following description assumes that j is 1. Since the first character was taken out in step 2, the first-rank candidate corresponding to that character is taken out; for the character candidates shown in FIG. 4, "カ" is taken out as the first-rank candidate.

[0039] The character threshold taken out of the word dictionary 41 in step 2 is then compared with the score of the character candidate taken out in step 3 (step 4). That is, it is determined whether the score of the character candidate is at or above the character threshold from the word dictionary 41.

[0040] If step 4 finds that the score is at or above the threshold, it is determined whether the two characters match (step 5). If they match, the score of the j-th rank character candidate is taken as the character evaluation value of the i-th character and added to the total character evaluation value (step 6). In this example, the first-rank character candidate is "カ" with a score of 100, and the first character of word number A in the word dictionary 41 is "カ" with a character threshold of 50, so processing passes through steps 4, 5, and 6 and 100 is added as the character evaluation value. The total character evaluation value is stored in the word recognition circuit 39 for each word number.

【００４１】[0041]

After step 6, processing returns to step 2 and the next character of the entry in the word dictionary 41 is taken out. For word number A shown in FIG. 5, this is "wa" (ワ). If step 5 finds that the characters do not match, the next character candidate is taken out in step 3. That is, the (j+1)-th candidate is taken out. For the character candidates shown in FIG. 4, this is "ke" (ケ), which is compared with the threshold in step 4 and then checked for a match in step 5.

【００４２】[0042]

While steps 3, 4 and 5 are repeated in this way, if step 4 finds that the score of the j-th ranked candidate is smaller than the character threshold of the i-th character, the i-th character is treated as a mismatched character and 1 is added to the number of mismatched characters (step 7). The number of mismatched characters is also incremented when, through steps 3, 4 and 5, the candidate scores are greater than or equal to the character threshold but no candidate matches after comparison down to the N-th ranked candidate. The total number of mismatched characters is stored in the word recognition circuit 39 for each word number.
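The handling of a single dictionary character in steps 3 to 7 can be sketched as follows. This is a minimal illustration, not the circuit described in the embodiment: the function name `score_dictionary_char`, the tuple layout of a candidate, and the assumption that candidates arrive ordered from rank 1 to rank N with non-increasing scores are all introduced here for the example. Every score except the 100 for "ka" is hypothetical.

```python
from typing import List, Optional, Tuple

# Assumed layout: (character, score), ordered from rank 1 to rank N.
Candidate = Tuple[str, int]


def score_dictionary_char(dict_char: str, threshold: int,
                          candidates: List[Candidate]) -> Optional[int]:
    """Return the matching candidate's score, or None for a mismatched character."""
    for cand_char, score in candidates:   # step 3: take the j-th ranked candidate
        if score < threshold:             # step 4: score is below the character threshold
            return None                   # step 7: treat as a mismatched character
        if cand_char == dict_char:        # step 5: do the characters match?
            return score                  # step 6: value to add to the evaluation total
    return None                           # no match down to rank N: also a mismatch


# Hypothetical candidate list for the first character position.
first_position = [("カ", 100), ("ケ", 60), ("サ", 20)]
print(score_dictionary_char("カ", 50, first_position))
print(score_dictionary_char("サ", 50, first_position))
```

Returning `None` for both kinds of mismatch keeps the caller simple: it adds the score to the total when one is returned and increments the mismatch count otherwise.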

【００４３】[0043]

Next, it is determined whether the total number of mismatched characters accumulated in step 7 is less than or equal to the maximum number of mismatched characters predetermined in the word dictionary 41 (step 8). If step 8 finds that the total is less than or equal to the predetermined maximum number of mismatched characters, processing returns to step 2 and the next character of the entry in the word dictionary 41 is taken out.

【００４４】[0044]

If step 8 finds that the total is larger than the predetermined maximum number of mismatched characters, it is determined that no word candidate was obtained (step 9). Thereafter, no further characters are examined for a word number judged in this way.

【００４５】[0045]

When step 2 finds that processing has reached the end of the character string in the word dictionary 41, the total number of mismatched characters found against the character candidates is compared with the maximum number of mismatched characters predetermined in the word dictionary (step 10).

【００４６】[0046]

If step 10 finds that the total is less than or equal to the maximum number of mismatched characters, the character evaluation value total is then compared with a threshold predetermined in the word dictionary (step 11). If step 11 finds that the character evaluation value total is greater than or equal to the threshold, a word candidate is deemed to have been obtained (step 12) and processing ends.

【００４７】[0047]

If step 10 finds that the total is larger than the maximum number of mismatched characters, or if step 11 finds that the character evaluation value total is smaller than the threshold, step 9 determines that no word candidate was obtained and processing ends.
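Steps 2 to 12 for one dictionary word can be put together roughly as below. This is a sketch under stated assumptions rather than the embodiment itself: the `DictWord` class, its field names and `evaluate_word` are hypothetical, there is assumed to be exactly one candidate list per character position, and candidates are assumed to be ordered from rank 1 to rank N. The example word and all scores other than the 100 for "ka" and the character threshold of 50 are invented.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Candidate = Tuple[str, int]  # assumed layout: (character, score), best rank first


@dataclass
class DictWord:
    chars: str                  # characters of the dictionary word
    char_thresholds: List[int]  # one character threshold per character
    max_mismatches: int         # maximum number of mismatched characters
    word_threshold: int         # threshold for the character evaluation value total


def evaluate_word(word: DictWord,
                  candidates_per_pos: List[List[Candidate]]) -> Optional[int]:
    """Return the evaluation total if the word is a candidate, otherwise None."""
    total = 0
    mismatches = 0
    for dict_char, threshold, candidates in zip(
            word.chars, word.char_thresholds, candidates_per_pos):  # step 2
        matched_score = None
        for cand_char, score in candidates:       # step 3
            if score < threshold:                 # step 4
                break
            if cand_char == dict_char:            # step 5
                matched_score = score
                break
        if matched_score is not None:
            total += matched_score                # step 6
        else:
            mismatches += 1                       # step 7
            if mismatches > word.max_mismatches:  # step 8
                return None                       # step 9: stop reading this word
    # Steps 10 and 11: final checks once the end of the word is reached.
    if mismatches <= word.max_mismatches and total >= word.word_threshold:
        return total                              # step 12
    return None                                   # step 9


# Hypothetical two-character word and candidate lists.
word = DictWord(chars="カワ", char_thresholds=[50, 50],
                max_mismatches=0, word_threshold=150)
candidates = [[("カ", 100), ("ケ", 70)], [("ワ", 90), ("ウ", 40)]]
print(evaluate_word(word, candidates))  # 100 + 90 = 190
```

Checking the mismatch count immediately after each increment is what allows a word to be dropped before its remaining characters are read.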

【００４８】[0048]

After this, word number B is processed in the same way. With this method, considering for example the processing of word number C, the first character "sa" (サ) and the second character "i" (イ) of the character string are both among the character candidates, but their scores are 20 and 30 respectively, which are smaller than the threshold of 50, so each is counted as a mismatched character in step 7. The number of mismatched characters is therefore 2, which is larger than the maximum number of mismatched characters of 1 for word number C. It is thus determined at the second character that C cannot be obtained as a word candidate, and the remaining characters of the string need not be accessed. That is, for word number C, only the stored number of mismatched characters needs to be checked.
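The early rejection of word number C can be traced with a short sketch. Only the scores 20 and 30, the threshold of 50 and the maximum of 1 mismatched character come from the description above. The function name `count_until_rejected`, the third and fourth characters of the word, and all other candidates and scores are invented for illustration. The matching test is written as a single `any(...)` expression, which gives the same result as walking the ranks one by one provided the candidates are sorted by non-increasing score.

```python
from typing import List, Tuple

Candidate = Tuple[str, int]  # assumed layout: (character, score)


def count_until_rejected(chars: str, thresholds: List[int], max_mismatches: int,
                         candidates_per_pos: List[List[Candidate]]) -> Tuple[bool, int]:
    """Return (rejected, number of dictionary characters accessed)."""
    mismatches = 0
    accessed = 0
    for dict_char, threshold, candidates in zip(chars, thresholds, candidates_per_pos):
        accessed += 1
        # A character matches only if some candidate equals it with a score
        # at or above the character threshold.
        matched = any(c == dict_char and s >= threshold for c, s in candidates)
        if not matched:
            mismatches += 1
            if mismatches > max_mismatches:
                return True, accessed  # rejected without reading the rest
    return False, accessed


# Hypothetical four-character word standing in for word number C.
result = count_until_rejected(
    chars="サイタマ",
    thresholds=[50, 50, 50, 50],
    max_mismatches=1,
    candidates_per_pos=[
        [("カ", 100), ("サ", 20)],  # "sa" scores 20, below the threshold
        [("ワ", 90), ("イ", 30)],   # "i" scores 30, below the threshold
        [("タ", 80)],
        [("マ", 85)],
    ],
)
print(result)  # (True, 2): rejected after the second character
```

In this trace only two of the four dictionary characters are read, which is the saving the paragraph describes.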

【００４９】[0049]

In this way, the address recognition circuit 19 selects a word suitable as an address from the likely word candidates that have been listed. Depending on the values of the thresholds and the maximum number of mismatched characters, a single word can even be identified uniquely, which speeds up processing in the address recognition circuit 19.

【００５０】[0050]

As described above, a character threshold is set for each character and used as one of the decision criteria, and the number of mismatched characters is counted character by character and compared with the maximum number of mismatched characters defined in the word dictionary, so word recognition is performed quickly. The present invention is not limited to the above embodiment and can be modified in various ways without departing from the gist of the invention.

【００５１】[0051]

[Effects of the Invention] As described above, according to the present invention, a character threshold is set for each character and used as one of the decision criteria, and the number of mismatched characters is counted character by character and compared with the maximum number of mismatched characters defined in the word dictionary, so word recognition is performed quickly.

[Brief description of drawings]

FIG. 1 is a flowchart showing the word recognition method of this embodiment.

FIG. 2 is a front view of the mail sorting machine of this embodiment.

FIG. 3 is a block diagram of the address recognition device of this embodiment.

FIG. 4 is a diagram showing the character candidates of this embodiment.

FIG. 5 is an example of the word dictionary used for word recognition in this embodiment.

[Explanation of symbols]

1: mail sorting machine; 9: address recognition device; 17: word recognition unit; 35: character recognition circuit; 37: character dictionary; 39: word recognition circuit; 41: word dictionary

Claims

1. A word recognition device comprising: reading means for reading a word written on a paper sheet; character recognition means for recognizing, character by character, the word on the paper sheet read by the reading means and outputting, for each character, a plurality of character candidates each given an evaluation value; storage means for storing a plurality of words expected to be written on the paper sheet, an allowable limit on the number of mismatched characters for each of these words, and a threshold determined for each character of each word; comparing means for comparing each character of a word stored in the storage means with the character candidates output by the character recognition means; counting means for counting a character as a mismatched character when, in the comparison by the comparing means, the evaluation value of the character candidate output by the character recognition means is smaller than the threshold of the corresponding character stored in the storage means; and discriminating means for determining that the word is not a corresponding word when the count of the counting means exceeds the allowable limit on the number of mismatched characters stored in the storage means.

2. A word recognition device comprising: reading means for reading a word written on a paper sheet; character recognition means for recognizing, character by character, the word on the paper sheet read by the reading means and outputting, for each character, a plurality of character candidates each given an evaluation value; storage means for storing a plurality of words expected to be written on the paper sheet and a threshold determined for each character of these words; comparing means for comparing whether each character of a word stored in the storage means matches the character candidates output by the character recognition means; and discriminating means for determining that a character is a mismatched character when, in the comparison by the comparing means, the evaluation value of the character candidate is smaller than the threshold of the corresponding character stored in the storage means.

3. A word recognition device comprising: reading means for reading a word written on a paper sheet; character recognition means for recognizing, character by character, the word on the paper sheet read by the reading means and outputting, for each character, a plurality of character candidates each given an evaluation value; storage means for storing a plurality of words expected to be written on the paper sheet and an allowable limit on the number of mismatched characters for each of these words; determination means for determining whether any of the character candidates matches the word by comparing the word stored in the storage means, in order from its first character, with the plurality of corresponding character candidates; word recognition means for recognizing a matching character candidate as a character forming the word when the determination means finds that a matching character candidate exists; counting means for counting the character as a mismatched character when the determination means finds that no matching character candidate exists; and judging means for determining that the word cannot be recognized when the count of the counting means exceeds the allowable limit on the number of mismatched characters stored in the storage means.

4. An address recognition device comprising: reading means for reading a word written on a paper sheet; character recognition means for recognizing, character by character, the word on the paper sheet read by the reading means and outputting, for each character, a plurality of character candidates each given an evaluation value; storage means for storing a plurality of words expected to be written on the paper sheet, an allowable limit on the number of mismatched characters for each of these words, and a threshold determined for each character of each word; comparing means for comparing each character of a word stored in the storage means with the character candidates output by the character recognition means; address recognition means for recognizing an address from the word when the comparison by the comparing means finds that the word and the character candidates match; counting means for counting a character as a mismatched character when, in the comparison by the comparing means, the evaluation value of the character candidate output by the character recognition means is smaller than the threshold of the corresponding character stored in the storage means; and determination means for determining that the word is not a corresponding word when the count of the counting means exceeds the allowable limit on the number of mismatched characters stored in the storage means.

5. A word recognition method for recognizing a word from words stored in a word dictionary and from candidates, obtained by character recognition, for each character constituting the word, the method comprising: a first step of storing in advance in the word dictionary the words, a threshold predetermined for each character of each word, and an allowable limit on the number of mismatched characters of each word; a second step of reading a word written on a medium; a third step of recognizing, character by character, the word on the medium read in the second step and outputting, for each character, character candidates each given an evaluation value; a fourth step of comparing the threshold for each character of the words in the word dictionary stored in the first step with the evaluation value of each character candidate output in the third step; a fifth step of determining whether the character of the word and the character candidate match when the fourth step finds that the evaluation value is greater than or equal to the threshold; a sixth step of counting the character of the word as a mismatched character when the evaluation value is determined by the fifth step to be smaller than the threshold; a seventh step of determining whether the number of mismatched characters counted in the sixth step is larger than the allowable limit on the number of mismatched characters stored in the first step; and an eighth step of determining that the word is not a corresponding word when the seventh step finds that the number of mismatched characters is larger than the allowable limit on the number of mismatched characters.