JPH05151384A

JPH05151384A - Correcting method for recognition character

Info

Publication number: JPH05151384A
Application number: JP3310482A
Authority: JP
Inventors: Tamotsu Maeda; 保前田
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1991-11-26
Filing date: 1991-11-26
Publication date: 1993-06-18

Abstract

PURPOSE: To correct recognized characters while reducing both the memory capacity needed to store correction information and the number of similarity calculations performed. CONSTITUTION: An image input part 1 inputs a character pattern, and a character recognizing part 2 derives the feature quantity of the pattern, a candidate character, and a reject code that indicates the possibility of erroneous recognition. Based on the reject code, and only for patterns with a high possibility of erroneous recognition, the feature quantity is stored in a feature quantity memory 5, and the candidate character and the character pattern coordinates are stored in a reject information memory 9. The candidate characters in the reject information memory 9 are displayed on a display part 4 as correction candidates and are corrected after confirmation by the operator. Candidate characters of patterns similar to a corrected candidate character are then extracted by a similarity calculation and displayed, and these too are corrected after the operator's confirmation. Because only the information on patterns with a high possibility of erroneous recognition is stored, the required storage capacity is reduced.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a method of correcting recognized characters in character recognition.

[0002]

[Prior Art] Character recognition devices have been developed in recent years, but their recognition accuracy remains an issue, and many of them provide a function for correcting recognized characters.

[0003]

A method of correcting recognized characters in a conventional character recognition device is described below with reference to the drawings. FIG. 5 is a block diagram showing the configuration of a conventional character recognition device capable of character correction. In the figure, the character code, character pattern, feature amount, and reject code of each recognized character output from the character recognition unit 2 are stored in the character code memory 3, the character pattern memory 10, the feature amount memory 5, and the reject information memory 9, respectively, one entry for every character pattern. Next, recognized characters with a high possibility of misrecognition are read, together with their character patterns, from the character code memory 3 and the character pattern memory 10 and are presented to the operator on the display unit 4. When the operator enters the correct character code from the correction information input unit 6 to correct a character code, and also for character patterns for which the operator gives no correction instruction, the similarity between the feature amount of the character pattern in the feature amount memory 5 and the feature amounts of the other character patterns is calculated. Character patterns whose similarity exceeds a predetermined value are displayed to the operator together with their character codes, the confirmation unit 8 asks the operator whether to correct each character code, and the correction is made when the operator permits it.

[0004]

[Problems to Be Solved by the Invention] In this conventional method of correcting recognized characters, the feature amounts and reject codes of all characters must be kept in memory, so a large memory capacity is required. In addition, the similarity calculation is performed uniformly even for characters with a low possibility of misrecognition, so processing takes a long time.

[0005]

The present invention solves the above problems, and its object is to provide a method of correcting recognized characters that requires only a small memory capacity and only a small number of similarity calculations.

[0006]

[Means for Solving the Problems] To achieve the above object, the present invention is a method of correcting recognized characters in which: a character pattern is read by image input means; the feature amount of the pattern is extracted by character recognition means, and a character code and a reject code are obtained from the feature amount; the character code is stored in first storage means; the feature amount of each specific character judged from the reject code to have a high possibility of misrecognition is stored, together with the information required for correction, in second storage means; in character correction processing, the specific characters in the second storage means are displayed to the operator by display means; and, when the operator corrects any first character among the displayed specific characters to a different second character, those characters in the second storage means whose similarity to the first character is at least a predetermined value are found by similarity calculation and displayed, and the operator checks each such character and corrects it to the second character or to another character.

[0007]

[Operation] In the above configuration, the second storage means stores the feature amounts of the specific characters with a high possibility of misrecognition together with the information required for correction. In character correction processing, the specific characters are displayed to the operator. When the operator changes any first character among the specific characters to a different second character, characters in the second storage means that are similar to the first character are found and displayed, and each is corrected to the second character or to another character as the operator chooses.

[0008]

[Embodiment]

(Embodiment 1) A method of correcting recognized characters according to one embodiment of the present invention is described below with reference to the drawings.

[0009]

FIG. 1 is a block diagram showing the configuration of a character recognition device that uses the method of correcting recognized characters according to one embodiment of the present invention. In the figure, reference numeral 1 denotes an image input unit that photoelectrically converts a document and outputs the binarized data to the document pattern memory 11. Reference numeral 2 denotes a character recognition unit that performs character recognition processing on the binarized data (preprocessing, feature extraction, and matching that derives a character code and a reject code from the feature amount) and outputs the coordinates of each character pattern in the document pattern memory 11, its feature amount, its candidate character, and its reject information. Reference numeral 3 denotes a candidate character memory that stores the candidate characters recognized by the character recognition unit 2. Reference numeral 4 denotes a display unit that displays the contents of the document pattern memory 11, the candidate characters in the candidate character memory 3, correction confirmation messages from the confirmation unit 8, and the like. Reference numeral 5 denotes a feature amount memory that stores the feature amounts output by the character recognition unit 2. Reference numeral 6 denotes a correction information input unit through which the operator enters correction information. Reference numeral 7 denotes a control unit that, when the operator approves a correction at the confirmation unit 8, controls the matching unit 10 and corrects the contents of the candidate character memory 3. Reference numeral 9 denotes a reject information memory that stores the reject information output by the character recognition unit 2. Reference numeral 10 denotes a matching unit that compares the feature amounts in the feature amount memory 5 with one another. Reference numeral 11 denotes a document pattern memory that stores the binary data from the image input unit 1. Reference numeral 12 denotes an AND gate that, among the characters output from the character recognition unit 2, writes only the information on rejected characters into the reject information memory 9 and the feature amount memory 5. In this embodiment, candidate characters are stored as character codes and reject information is stored as reject codes, feature amounts, and the like, but other means such as character fonts may be used.

[0010]

FIG. 2 is a block diagram showing the storage means of the character recognition device that uses the method of correcting recognized characters according to one embodiment of the present invention, together with its peripheral devices. In the figure, reference numeral 21 denotes an image scanner that reads a document, converts it into bit data, and outputs it. Reference numeral 22 denotes a RAM comprising the document pattern memory 11, which stores the bit data output by the image scanner 21; the candidate character memory 3, which stores recognized characters; the feature amount memory 5, which stores the feature amounts of those candidate characters in the candidate character memory 3 that have a high possibility of misrecognition; the reject information memory 9, which stores information on rejects; a register area 27 used for processing; and a correction information area 28, which stores the operator's correction information for misrecognized characters. Reference numeral 23 denotes a ROM comprising a dictionary area 30, which stores feature amounts and their corresponding character codes, and a program storage area 31, which stores the program that controls the correction operation. Reference numeral 24 denotes a processing circuit that performs processing according to the control program stored in the program storage area 31, reference numeral 25 denotes a keyboard for entering data, and reference numeral 4 denotes the display unit.

[0011]

The interrelation and operation of the above components are described below with reference to the drawings. First, the preliminary processing for correction is described. FIG. 3 is a flowchart of this preliminary processing. In step s100, the counters i and j, which track the processing order, are each initialized to 1. Here i is the position of the character being recognized, and j is the storage position in the feature amount memory 5 and the reject information memory 9. Processing starts from i = 1, that is, from the first character of the candidate character string. In step s101, the binarized document data input from the image input unit 1 undergoes preprocessing such as noise removal and character segmentation, and processing moves to step s102. In step s102, if there is no i-th character pattern to be recognized, processing moves to step s110; if there is one, processing moves to step s103, where character recognition extracts the feature amount e[i] of the character pattern and the corresponding candidate character, and then to step s104. In step s104, the code a[i] of the candidate character is stored in the candidate character memory 3. Processing then moves to step s105, where the possibility of misrecognition is judged from the reject code of the recognized character pattern. If the possibility is high, the pattern is treated as one to be rejected and processing moves to step s106; if the pattern is not rejected, processing moves to step s109. The judgment in step s105 may instead use a means that treats predetermined specific character patterns as prone to misrecognition.
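The judgment in step s105 can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not define how the reject code is encoded, so the sketch assumes a non-negative integer in which a larger value means lower recognition confidence. The names `is_doubtful`, `REJECT_THRESHOLD`, and `CONFUSABLE` are hypothetical and do not appear in the source.

```python
# Assumption: reject codes are non-negative integers; larger means less confident.
REJECT_THRESHOLD = 1

# Assumption: a predetermined set of characters treated as prone to
# misrecognition (the alternative means mentioned for step s105).
CONFUSABLE = frozenset("あお")


def is_doubtful(reject_code, candidate,
                threshold=REJECT_THRESHOLD, confusable=CONFUSABLE):
    """Return True when the pattern should be rejected (step s105)."""
    return reject_code >= threshold or candidate in confusable


print(is_doubtful(0, "か"))  # confident result, not in the confusable set
print(is_doubtful(0, "お"))  # confident result, but a predetermined character
print(is_doubtful(2, "か"))  # reject code at or above the threshold
```

Combining both criteria with `or` is a design choice of this sketch; the source presents the predetermined-character rule as an alternative to the reject-code rule, so either test could also be used on its own.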

[0012]

In step s106, when the i-th character pattern is the j-th character pattern to be rejected, its feature amount e[i] is stored in the feature amount memory 5 as the j-th entry e[j]. In step s107, the address at which a[i] was stored in the candidate character memory 3, the address at which e[j] was stored in the feature amount memory 5, and the coordinates of the character pattern in the document pattern memory 11 are stored in the b[j], c[j], and d[j] areas of the reject information memory 9, respectively. Processing then moves to step s108, where 1 is added to j, and on to step s109, where 1 is added to i; it then returns to step s102 and proceeds to recognition of the next, (i+1)-th character pattern. By repeating the above processing in sequence up to the last character pattern, information on the character patterns with a high possibility of misrecognition is stored in the reject information memory 9, one entry per such pattern. When all character patterns have been processed, processing moves to step s110, where 0 is stored in b[j] to mark the end of the reject information. For candidate characters marked as corrected in the correction processing described later, b[j] is set to 1; accordingly, the addresses of candidate characters in the candidate character memory 3 are assumed to take values other than 0 and 1. The display unit 4 displays the contents of the document pattern memory 11 and the character codes in the candidate character memory 3 or the character fonts corresponding to them.
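The preliminary processing of FIG. 3 (steps s100 to s110) can be sketched in Python as follows. This is a simplified illustration, not the patented implementation: recognition results are passed in as ready-made tuples, Python lists are 0-based whereas the text counts i and j from 1, and "addresses" are modeled as list positions shifted by a hypothetical `ADDRESS_BASE` so that they never collide with the marker values 0 and 1. The function name `preprocess` and the tuple layout are assumptions.

```python
END_MARK = 0       # b[j] == 0 marks the end of the reject information (s110)
ADDRESS_BASE = 2   # assumption: addresses 0 and 1 are reserved as markers


def preprocess(results, is_doubtful):
    """Build the candidate memory a, feature memory e, and reject info b, c, d.

    results: iterable of (candidate, feature, reject_code, coordinate) tuples,
             one per segmented character pattern, in reading order.
    is_doubtful: predicate on the reject code (step s105).
    """
    a, e, b, c, d = [], [], [], [], []
    for candidate, feature, reject_code, coord in results:  # s102, s103
        a.append(candidate)                                 # s104
        if is_doubtful(reject_code):                        # s105
            e.append(feature)                               # s106
            b.append(ADDRESS_BASE + len(a) - 1)             # s107: address of a[i]
            c.append(len(e) - 1)                            # s107: position of e[j]
            d.append(coord)                                 # s107: pattern coordinate
    b.append(END_MARK)                                      # s110
    return a, e, b, c, d


# Hypothetical input: "あかいあさがお" misrecognized as "おかいおさがお",
# with the three "お" results flagged by a non-zero reject code.
results = [(ch, [float(n)], 1 if ch == "お" else 0, (n * 16, 0))
           for n, ch in enumerate("おかいおさがお")]
a, e, b, c, d = preprocess(results, lambda code: code > 0)
print("".join(a), b, c)
```

Only three of the seven feature vectors are retained, which is the memory saving the method aims at.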

[0013]

When the above preliminary processing is complete, the correction processing begins. The correction processing is described below with reference to the drawings. FIG. 4 is a flowchart of the correction processing. The correction processing is executed on the basis of the information in the reject information memory 9, following its storage order j. First, in step s10, the counter j is initialized to 1. In step s11, the content of b[j] stored in the reject information memory 9 is examined; if b[j] = 0, no candidate characters remain to be corrected and processing ends. If b[j] ≠ 0, processing moves to step s12, which judges whether the candidate character stored in the candidate character memory 3 at the address given by b[j] has already been corrected. That is, as described above, b[j] = 1 marks a corrected character, so processing moves to step s25; if b[j] ≠ 1, the character is uncorrected, so processing moves to step s13, where the character pattern at the coordinates given by d[j] in the reject information memory 9 is displayed to the operator on the display unit 4, and then to step s14, where the operator decides whether to correct it. If the displayed character is to be corrected to another character, processing moves to step s15; if not, processing moves to step s25. In step s15, the candidate character stored in the candidate character memory 3 at the address given by b[j] is replaced with the character entered by the operator. Then, through the processing from step s16 onward described below, candidate characters similar to the original candidate character just corrected are sought among the unprocessed candidate characters and are corrected upon the operator's confirmation. If no correction is made in step s14, processing moves to step s25, where 1 is added to j, and returns to step s11 to process the next, (j+1)-th item of reject information.

[0014]

When the candidate character at address b[j] has been corrected in step s15, processing moves on to search, by similarity calculation, the uncorrected candidate characters at the subsequent addresses b[j+1], b[j+2], ... for candidate characters similar to the character at address b[j], and to correct them upon the operator's confirmation. To this end, the counter k is first set to j+1 in step s16, and in step s19 the feature amount e[k] stored in the feature amount memory 5 for k (= j+1) is compared with the feature amount e[j] of the original character pattern that has just been corrected. Before that comparison, step s17 checks whether reject information corresponding to k remains in the reject information memory 9, and step s18 checks whether the character corresponding to k has already been corrected. If no reject information remains, processing moves to step s25; if the character has already been corrected, processing moves to step s23.

[0015]

In step s19, the similarity between the feature amount e[j] of the j-th character and the feature amount e[k] of the k-th character, both held in the feature amount memory 5 and located through the j-th and k-th entries of the reject information memory 9, is calculated, and it is judged whether the similarity exceeds a predetermined threshold. If it does, processing moves to step s20, where the candidate character is presented to the operator. If the operator wishes to correct it, it is corrected in step s22, and in step s24 the address b[k] is set to 1 to mark it as corrected. In step s23, 1 is added to k, and the feature amount of the character at the next address b[j+2] is compared with the feature amount at address b[j]. In this way, the characters at addresses b[j+1], b[j+2], ... are compared for similarity with the corrected character at address b[j], and those found to be similar are displayed and corrected upon the operator's confirmation.
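The correction processing of FIG. 4 (steps s10 to s25) can be sketched in Python as follows. Several points are assumptions made for illustration only: the source does not name a similarity measure, so cosine similarity is used here as a stand-in; the operator is modeled by two hypothetical callbacks, `ask_fix` and `ask_similar`, that return a replacement character or `None`; the feature list `e` is indexed directly by the reject entry number instead of through c[j]; and list indices are 0-based, with candidate "addresses" offset by a hypothetical `ADDRESS_BASE` so that 0 and 1 remain free as markers.

```python
import math

END_MARK, DONE_MARK, ADDRESS_BASE = 0, 1, 2


def cosine(u, v):
    """Cosine similarity; an assumed stand-in for the similarity calculation."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def correct(a, e, b, ask_fix, ask_similar, threshold=0.9):
    """Run the correction loop over reject entries; mutates a and b in place."""
    j = 0                                                   # s10
    while b[j] != END_MARK:                                 # s11
        if b[j] != DONE_MARK:                               # s12
            pos = b[j] - ADDRESS_BASE
            new_char = ask_fix(pos, a[pos])                 # s13, s14
            if new_char is not None:
                a[pos] = new_char                           # s15
                k = j + 1                                   # s16
                while b[k] != END_MARK:                     # s17
                    if (b[k] != DONE_MARK                   # s18
                            and cosine(e[j], e[k]) > threshold):  # s19
                        kpos = b[k] - ADDRESS_BASE
                        choice = ask_similar(kpos, a[kpos], new_char)  # s20
                        if choice is not None:
                            a[kpos] = choice                # s22
                            b[k] = DONE_MARK                # s24
                    k += 1                                  # s23
        j += 1                                              # s25
    return a


# Hypothetical data for "あかいあさがお" misrecognized as "おかいおさがお".
a = list("おかいおさがお")
e = [[1.0, 0.0], [0.99, 0.1], [0.95, 0.2]]   # made-up feature vectors
b = [2, 5, 8, END_MARK]                      # entries for positions 0, 3, 6
fixed = correct(
    a, e, b,
    ask_fix=lambda pos, ch: "あ" if pos == 0 else None,
    ask_similar=lambda pos, ch, suggestion: suggestion if pos == 3 else None,
)
print("".join(fixed))
```

In this sketch an entry that the operator declines during the similarity pass is not marked as corrected, so it is offered once more when the outer loop reaches it; the source does not spell out this case.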

[0016]

The operation is described concretely below using the seven-character string "あかいあさがお" as an example. In the following description, the character patterns "あ" and "お" are assumed to have similar feature amounts and therefore a high possibility of misrecognition, and the string is assumed to be misrecognized as "おかいおさがお".

[0017]

The image input unit 1 stores the character string pattern to be recognized in the document pattern memory 11 as binary image data. Next, in step s101, preprocessing such as noise removal and character segmentation is performed on the image stored in the document pattern memory 11, and the preliminary processing from step s102 onward is executed on the segmented character patterns in order, starting from the first. As a result, the candidate characters are stored in the candidate character memory 3. Table 1 shows the structure of the candidate character memory 3.

[0018]

[Table 1]

[0019]

In Table 1, the number i is the position of the character within the character string being recognized, and a[i] is the content of the candidate character, that is, its character code. In this embodiment, the character pattern "あ" has been misrecognized as "お" and stored as such.

[0020]

For character patterns with a high possibility of misrecognition, the feature amount is stored in the feature amount memory 5 by step s104. Table 2 shows the structure of the feature amount memory 5.

[0021]

[Table 2]

[0022]

In Table 2, the number j denotes the j-th character pattern rejected as having a high possibility of misrecognition, and its feature amount e[j] is stored as the j-th entry of the memory. In this embodiment, the feature amounts extracted by recognizing the character pattern "あ" are regarded as having a high possibility of misrecognition and are stored, in order of detection, as e[1], e[2], and e[3].

[0023]

Table 3 shows the structure of the reject information memory.

[0024]

[Table 3]

[0025]

In Table 3, the number j indicates that the rejected character pattern is the j-th one and also corresponds to the j-th entry of the reject information memory. &a[i] is the address in the candidate character memory 3 of the code a[i] of the rejected character, &e[j] is the address in the feature amount memory 5 of the feature amount e[j] of the rejected character, and d[j] is the coordinate of the rejected character in the document pattern memory 11. At j = 4, the address value 0 is stored in b[4] to indicate that no further reject information follows.
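The role of the two marker values in the b column (0 for the end of the reject information, 1 for an entry already corrected) can be sketched in Python as follows. The address values are invented for illustration, the helper name `pending_entries` is hypothetical, and a dummy element at index 0 is used so that j counts from 1 as in the text.

```python
END_MARK = 0    # b[j] == 0: no further reject information
DONE_MARK = 1   # b[j] == 1: the candidate character was already corrected


def pending_entries(b):
    """Yield each 1-based j whose entry still awaits the operator."""
    j = 1
    while b[j] != END_MARK:
        if b[j] != DONE_MARK:
            yield j
        j += 1


# Hypothetical addresses for three rejected characters; index 0 is unused.
b = [None, 0x100, 0x103, 0x106, END_MARK]
before = list(pending_entries(b))

b[2] = DONE_MARK            # the second rejected character gets corrected
after = list(pending_entries(b))
print(before, after)
```

Because 0 and 1 are reserved in this way, real candidate addresses must avoid those two values, which matches the assumption stated for the candidate character memory 3.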

[0026]

The correction processing proceeds in order from j = 1 of the reject information memory shown in Table 3. When the candidate character stored in the candidate character memory 3 at the address given by b[1] (= &a[1]) for j = 1 is displayed, "お" appears, so the operator corrects it to "あ". At this point, candidate characters among j = 2 and j = 3 whose feature amounts are similar to the feature amount at the address given by c[1] (= &e[1]) are detected in order by similarity calculation and displayed, and each one detected is corrected at the operator's discretion. In this embodiment, when the first character of the string is corrected from "お" to "あ", the fourth character "お" is displayed and corrected to "あ"; next, the seventh character "お" is displayed, and it is left as "お" without correction. Then, at j = 4, the address value b[4] is 0, so no reject information remains and the correction processing ends.

[0027]

As described above, according to the method of correcting recognized characters of this embodiment, feature amounts and the information needed for correction are stored only for characters with a high possibility of misrecognition. This reduces the required memory capacity and the number of similarity calculations, so that recognized characters can be corrected with better cost performance than before.
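The scale of the saving can be illustrated with a small Python calculation. The figures are illustrative upper bounds under an assumption of this sketch, namely that in the worst case every pair of stored feature amounts is compared once; they are not measurements or numbers given in the source. The function names are hypothetical.

```python
def pair_count(n):
    """Number of distinct pairs among n items (n choose 2)."""
    return n * (n - 1) // 2


def compare_costs(total_chars, rejected_chars):
    """Feature vectors stored and worst-case similarity calculations."""
    return {
        "conventional": {
            "features_stored": total_chars,
            "max_similarity_calcs": pair_count(total_chars),
        },
        "proposed": {
            "features_stored": rejected_chars,
            "max_similarity_calcs": pair_count(rejected_chars),
        },
    }


# The seven-character example with three rejected characters.
costs = compare_costs(total_chars=7, rejected_chars=3)
print(costs)
```

For the example string this gives 7 stored feature vectors and at most 21 comparisons when every character is kept, against 3 stored feature vectors and at most 3 comparisons when only rejected characters are kept.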

[0028]

Although this embodiment has been described using character recognition as an example, the method is of course also applicable to speech recognition.

[0029]

[Effects of the Invention] As is clear from the above embodiment, the present invention is a method of correcting recognized characters in which: a character pattern is read by image input means; the feature amount of the pattern is extracted by character recognition means, and a character code and a reject code are obtained from the feature amount; the character code is stored in first storage means; the feature amount of each specific character judged from the reject code to have a high possibility of misrecognition is stored, together with the information required for correction, in second storage means; in character correction processing, the specific characters in the second storage means are displayed to the operator by display means; and, when the operator corrects any first character among the displayed specific characters to a different second character, those characters in the second storage means whose similarity to the first character is at least a predetermined value are found by similarity calculation and displayed, and the operator checks each such character and corrects it to the second character or to another character. Because the feature amounts extracted from the original character patterns and the information needed for correction are stored only for characters with a high possibility of misrecognition, fewer characters are subject to the matching calculation than before, less memory and less matching computation are required, and recognized characters can be corrected with good cost performance.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of a character recognition device that uses the method of correcting recognized characters according to one embodiment of the present invention.

FIG. 2 is a block diagram showing the configuration of the storage means and its peripheral devices in a character recognition device that uses the method of correcting recognized characters according to one embodiment of the present invention.

FIG. 3 is a flowchart showing the preliminary processing in the method of correcting recognized characters according to one embodiment of the present invention.

FIG. 4 is a flowchart showing the character correction processing in the method of correcting recognized characters according to one embodiment of the present invention.

FIG. 5 is a block diagram showing the configuration of a character recognition device that uses a conventional method of correcting recognized characters.

[Explanation of symbols]

1 Image input unit (image input means)
2 Character recognition unit (character recognition means)
3 Candidate character memory (first storage means)
4 Display unit (display means)
5 Feature amount memory (second storage means)
9 Reject information memory (second storage means)

Claims

[Claims]

1. A method of correcting recognized characters, in which: a character pattern is read by image input means; the feature amount of the pattern is extracted by character recognition means, and a character code and a reject code are obtained from the feature amount; the character code is stored in first storage means; the feature amount of each specific character judged from the reject code to have a high possibility of misrecognition is stored, together with the information required for correction, in second storage means; in character correction processing, the specific characters in the second storage means are displayed to the operator by display means; and, when the operator corrects any first character among the displayed specific characters to a different second character, those characters in the second storage means whose similarity to the first character is at least a predetermined value are found by similarity calculation and displayed, and the operator checks each such character and corrects it to the second character or to another character.