JPH06290308A

JPH06290308A - Character recognizing device

Info

Publication number: JPH06290308A
Application number: JP5076601A
Authority: JP
Inventors: Makoto Kushima; 真久島; Koichi Higuchi; 浩一樋口
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1993-04-02
Filing date: 1993-04-02
Publication date: 1994-10-18
Anticipated expiration: 2016-10-22
Also published as: JP3221968B2

Abstract

PURPOSE: To provide a character recognition device whose operability is improved by making it easier to select the correct word from among the candidate words shown on a display unit. CONSTITUTION: A post-processing unit 16 performs knowledge processing based on the recognition result of a recognition unit 12 and outputs candidate words. When an operator selects the correct word from the candidate words, a display unit 20 uses different display modes for candidate words whose word information was stored in the knowledge dictionary from the beginning and for candidate words whose word information was registered through a word registration unit 14. The operator can therefore easily distinguish words stored in the knowledge dictionary from the beginning from words registered by users when selecting among the candidate words. As a result, the operator can instantly find a candidate word that is likely to be correct and carry out correction and confirmation.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a character recognition device with good operability that can process forms or documents quickly and accurately.

[0002]

[Description of the Related Art] Knowledge dictionaries have conventionally been used to improve the recognition rate of handwritten characters. The information held by a knowledge dictionary includes information on the words to be recognized, context information, and other information. The user of the character recognition device can also newly register such information. As a conventional character recognition technique that uses the word information of a knowledge dictionary, the technique disclosed in, for example, Document 1 (Proceedings of the 1982 (Showa 57) National General Conference of the Institute of Electronics and Communication Engineers of Japan, Part 5, p. 326) is described below.

[0003] First, a predetermined area of a form or document is optically scanned, and the optical signal from the paper surface is photoelectrically converted to obtain image data of the form or document. Character patterns to be recognized are then cut out from the image data. Each character is recognized based on its cut-out character pattern, yielding one or more candidate characters as the recognition result. Next, for the characters in one unit of knowledge processing, character strings are formed by combining the one or more candidate characters obtained for each character; the similarity between each character string and the words of the word information is calculated using the candidate rank or similarity attached to each candidate character; and the character string with the maximum similarity is selected. The word of the word information corresponding to this character string is then displayed. However, when the similarity of that character string is at or below a predetermined threshold, the character string formed by combining the first-ranked candidate characters is displayed instead.

[0004] To correct the knowledge processing result, the group of words of the word information whose similarity to the character string is at or above a predetermined threshold is displayed as candidate words, and if the correct word is among them, the correction is made by selecting it.

[0005]

[Problem to Be Solved by the Invention] In the above character recognition technique, candidate words may be displayed in two cases: (1) a word of word information that existed from the beginning is output as a candidate word, and (2) a word of word information registered by the user is output as a candidate word. Because users register word information that is highly necessary for their purpose, a word in case (2) is more likely to be correct than a word in case (1). Nevertheless, words in cases (1) and (2) were displayed in the same way. The operator therefore could not find the word most likely to be correct merely by glancing at the displayed candidate words, and had to examine every displayed word carefully to select the correct one. As a result, editing work such as correction and confirmation took a long time, and the number of forms or documents that could be processed per unit time was reduced.

[0006] An object of the present invention is to solve the conventional problem described above and to provide a character recognition device with improved operability, in which selecting the correct word from among the displayed candidate words is easier than before.

[0007]

[Means for Solving the Problem] To solve the above problem, the present invention provides a character recognition device comprising a recognition unit that outputs a recognition result for a character pattern cut out from quantized image data of a form or document, a word registration unit for registering word information in a knowledge dictionary, a post-processing unit that uses the knowledge dictionary to output a knowledge processing result based on the recognition result together with candidate words, a result editing unit that edits the post-processing result, a display unit that displays the candidate words, and an input unit for inputting a correct word, characterized in that the display unit changes the display method of a candidate word depending on whether the candidate word is a word of word information that existed in the knowledge dictionary from the beginning or a word of word information registered by the word registration unit.

[0008]

[Operation] According to the present invention, when the correct word is selected from the candidate words, the display on the display unit differs between the case where a word of word information that existed in the knowledge dictionary from the beginning is output as a candidate word and the case where a word of word information registered by the word registration unit is output as a candidate word. When selecting a candidate word, the operator can therefore easily distinguish words that existed in the knowledge dictionary from the beginning from words registered by the user, and can instantly find a candidate word that is likely to be correct and carry out correction, confirmation, and similar work. The above problem is thereby solved.

[0009]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings. The drawings are shown only schematically, to the extent needed for the invention to be understood; the shape, arrangement, dimensions, input/output signals, and connection relationships of the components are therefore not limited to the illustrated examples.

[0010] FIG. 1 is a functional block diagram used to explain one embodiment of the present invention. The character recognition device 10 of this embodiment comprises a recognition unit 12 that cuts out character patterns from quantized image data of a form or document and outputs recognition results for the cut-out character patterns, a word registration unit 14 that registers word information desired by the user of the character recognition device, a post-processing unit 16 that outputs a knowledge processing result for the recognition result and candidate words based on a knowledge dictionary, a result editing unit 18 for correcting and confirming the knowledge processing result, a display unit 20 that displays the knowledge processing result and the candidate words, and an input unit 22 for inputting a correct word. It further comprises a control unit 24 that controls the operation of the word registration unit 14, the result editing unit 18, the display unit 20, and the input unit 22. In FIG. 1, reference numeral 26 denotes a photoelectric conversion unit that outputs quantized image data of a form or document, and 28 denotes an image memory that stores the image data from the photoelectric conversion unit 26.

[0011] FIG. 2 is a diagram showing an example of a form. In FIG. 2, 30 denotes an example of a form on which an address is written, and 32 denotes an entry frame that designates a character entry area.

[0012] FIG. 3 is a diagram showing an example of candidate word selection. In FIG. 3, 34 denotes a display screen, 36 a form, 38 the character string area to be corrected, 40 the cursor position, 42 a candidate word display frame, and 44 a user-registered word whose display method has been changed.

[0013] This embodiment is described in more detail below with reference to FIGS. 1, 2, and 3. The photoelectric conversion unit 26 optically scans a predetermined reading range on the form or document, photoelectrically converts the optical signal L from the form or document, and outputs image data quantized into black-and-white binary values; the image memory 28 stores this image data.

[0014] The recognition unit 12 cuts out character patterns from the image data in the image memory 28 and extracts, from each cut-out character pattern, various features of the character to be recognized. It then collates the features of the cut-out character pattern with the features of standard character patterns and outputs the character recognition result and candidate ranks. One or more candidate characters are obtained as the recognition result for each character. When there is one candidate character, it is output with candidate rank 1 attached; when there are several candidate characters, each is output with the candidate rank determined for it attached.

[0015] The word registration unit 14 additionally registers word information desired by the user of the character recognition device in the knowledge dictionary. Information is attached to the word information in the knowledge dictionary so that word information registered by the word registration unit can be distinguished from the general word information provided from the beginning. This information is hereinafter called word registration information.
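As an illustration only, the word registration information of paragraph [0015] can be pictured as a per-word flag kept in the knowledge dictionary. The following minimal Python sketch is not taken from the patent; the class name `KnowledgeDictionary` and its methods are hypothetical, and a simple in-memory mapping is assumed.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class KnowledgeDictionary:
    """Hypothetical in-memory knowledge dictionary.

    Maps each word to its word registration information:
    False = word provided from the beginning, True = word added by the user.
    """
    entries: Dict[str, bool] = field(default_factory=dict)

    def add_builtin(self, word: str) -> None:
        # Word information prepared from the beginning.
        self.entries[word] = False

    def register(self, word: str) -> None:
        # Word information added through the word registration unit.
        self.entries[word] = True

    def is_user_registered(self, word: str) -> bool:
        # Unknown words are treated as not user-registered (an assumption).
        return self.entries.get(word, False)


dictionary = KnowledgeDictionary()
dictionary.add_builtin("Tokyo")
dictionary.register("Toranomon")
```

Keeping the flag next to the word lets later stages decide how to display a candidate without a second lookup table.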

[0016] The post-processing unit 16 performs knowledge processing using word information, based on the recognition result from the recognition unit 12. When the post-processing unit 16 receives the recognition results for the characters of one unit of knowledge processing (for example, the recognition result for the prefecture-name entry area of the form 30 shown in FIG. 2), it collates the character strings formed by combining the candidate characters of each character in that unit with the words of the word information, and checks whether a word corresponding to a character string of candidate characters exists in the word information. When it detects, among the combined character strings, a character string A that matches a word of the word information, it calculates an evaluation value J for character string A. If S denotes the sum of the candidate ranks attached to the candidate characters of the character string and N denotes the total number of characters in the character string, the evaluation value J can be expressed as, for example, J = S ÷ N.
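The combination and scoring step of paragraph [0016] can be sketched as follows. This is a minimal sketch, not the patented implementation: the function name `matching_strings` is hypothetical, the candidates for each character position are assumed to be listed in rank order (rank 1 first), and the word information is assumed to be a plain set of strings.

```python
from itertools import product


def matching_strings(candidates, words):
    """Return (string, J) for every candidate combination found in `words`.

    candidates: one list per character position, ordered by candidate rank.
    words: set of dictionary words (the word information).
    J = S / N, where S is the sum of candidate ranks and N the string length.
    """
    # Attach the 1-based candidate rank to each candidate character.
    ranked = [[(ch, rank) for rank, ch in enumerate(position, start=1)]
              for position in candidates]
    results = []
    for combo in product(*ranked):
        text = "".join(ch for ch, _ in combo)
        if text in words:  # exact match on every character position
            rank_sum = sum(rank for _, rank in combo)
            results.append((text, rank_sum / len(combo)))
    return results


# Two character positions with two candidates each.
found = matching_strings([["a", "b"], ["x", "c"]], {"ax", "bc"})
```

With these inputs, "ax" is built from two first-ranked candidates and "bc" from two second-ranked candidates, so the smaller J belongs to "ax".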

[0017] Whether a word and character string A match is determined, for example, by whether the character codes of the characters at corresponding positions in the word and in character string A all match. When all the character strings formed for one unit of knowledge processing have been collated with the word information, the character string A with the smallest evaluation value J among the detected character strings A is selected as the knowledge processing result. Word information whose evaluation value is at or below a predetermined evaluation value is selected as candidate words for character string A. Character string A is sent to the result editing unit 18 together with the candidate words.

[0018] If only one character string A has been detected when all the character strings of one unit of knowledge processing have been collated with the words of the word information, that character string A is selected as the knowledge processing result and sent to the result editing unit 18.

[0019] If no character string A has been detected when all the character strings of one unit of knowledge processing have been collated with the words of the word information, the character string A formed by combining the first-ranked candidate character of each character in that unit is selected as the knowledge processing result and sent to the result editing unit 18.
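The three cases of paragraphs [0017] to [0019] (several matches, exactly one match, no match) can be summarized in a short sketch. It is illustrative only: the function name `select_result` and the parameter `candidate_limit` (standing in for the "predetermined evaluation value") are hypothetical, and the matches are assumed to be given as (string, J) pairs.

```python
def select_result(matches, first_rank_string, candidate_limit):
    """Pick the knowledge processing result and the candidate words.

    matches: list of (string, J) pairs for strings found in the dictionary.
    first_rank_string: string built from the first-ranked candidate characters.
    candidate_limit: assumed threshold; words with J at or below it become
        candidate words.
    Returns (result_string, candidate_words).
    """
    if not matches:
        # [0019]: nothing matched, fall back to the first-ranked characters.
        return first_rank_string, []
    if len(matches) == 1:
        # [0018]: a single match is the result; no candidate list is sent.
        return matches[0][0], []
    # [0017]: smallest J wins; low-J words are offered as candidates.
    ordered = sorted(matches, key=lambda match: match[1])
    candidates = [text for text, j in ordered if j <= candidate_limit]
    return ordered[0][0], candidates


result, candidate_words = select_result(
    [("ax", 1.0), ("bc", 2.0), ("zz", 3.0)], "ax", candidate_limit=2.0)
```

Sorting by J also fixes the order in which candidate words would be listed, which is a design choice of this sketch rather than something the text specifies.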

[0020] When candidate words exist for the character string A sent from the post-processing unit 16, the result editing unit 18 sends character string A and the candidate words to the control unit 24. The control unit 24 then sends character string A, the candidate words, and a display instruction for each candidate word according to its word registration information to the display unit 20. If the candidate word is not a word of word information registered by the word registration unit, the control unit sends an instruction to display it in a first color; if it is a word of word information registered by the word registration unit, the control unit sends an instruction to display it in a second color different from the first. The display unit 20 then displays character string A and the candidate words in the manner specified by the display instruction (for example, the display screen 34 shown in FIG. 3). The operator judges whether character string A on the display screen is incorrect and, if so, selects the correct word from the candidate words. If the correct word is not among the candidate words, the operator inputs it using a keyboard or the like.
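The color decision of paragraph [0020] amounts to a lookup on the word registration information. The sketch below is an assumption-laden illustration: the function name `display_instructions` is hypothetical, and the concrete colors "white" and "yellow" are placeholders, since the text speaks only of a first and a second color.

```python
def display_instructions(candidate_words, user_registered,
                         first_color="white", second_color="yellow"):
    """Pair each candidate word with the color it should be displayed in.

    user_registered: set of words added through the word registration unit.
    Words from the original dictionary get first_color; user-registered
    words get second_color so they stand out to the operator.
    """
    return [(word, second_color if word in user_registered else first_color)
            for word in candidate_words]


instructions = display_instructions(["Tokyo", "Toranomon"], {"Toranomon"})
```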

[0021] When no candidate word exists for character string A, the result editing unit 18 sends character string A to the display unit 20 via the control unit 24, and the display unit 20 displays it. The operator judges whether character string A on the display screen is incorrect and, if so, inputs the correct word using a keyboard or the like. The result editing unit 18 receives this correct word via the input unit 22 and the control unit 24 and substitutes it for the incorrect character string. When the operator judges that the character string A displayed on the display unit 20 is correct, the result editing unit 18 outputs character string A as it is.

[0022] The present invention is not limited to the embodiment described above; the configuration, operation, processing content, input/output signals, and numerical conditions of each component may be changed as appropriate. For example, in the embodiment described above the user additionally registers word information in the general knowledge dictionary through the word registration unit, but a separate knowledge dictionary dedicated to the user may instead be provided and the necessary word information registered there.

[0023] In the embodiment described above, the evaluation value J is the sum S of the candidate ranks attached to the candidate characters of the character string divided by the total number N of characters in the character string. Instead of the sum S of candidate ranks, the sum of scores assigned to the candidate ranks may be used (for example, 100 points for candidate rank 1 and 90 points for candidate rank 2, so that the score decreases as the candidate rank goes down). Alternatively, instead of the sum S of candidate ranks, the sum of the appearance frequencies of the candidate characters of the character string may be used (in this case the appearance frequencies are held in advance by the recognition unit). Alternatively, the similarity between each candidate character and the character pattern corresponding to it may be obtained, and the sum of these similarities over the candidate characters of the character string may be used instead of the sum S. Alternatively, the distance between the dictionary matrix of each candidate character and the feature values of the character pattern corresponding to it may be obtained, and the sum of these distances over the candidate characters of the character string may be used instead of the sum S. Alternatively, instead of the sum S of candidate ranks alone, both the sum of the appearance frequencies of the candidate characters of the character string and the sum of the candidate ranks may be used.
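The score-based variant in paragraph [0023] can be illustrated with a small sketch. Only the two data points (100 points for rank 1, 90 points for rank 2) come from the text; the linear 10-point step for lower ranks, the floor at zero, and the names `rank_score` and `score_based_value` are assumptions made for illustration. Note also that with scores a larger value indicates a better string, the opposite of the rank-based J; the text does not spell this out, so it is an interpretation.

```python
def rank_score(rank):
    """Assumed mapping: 100 points for rank 1, 10 fewer per rank, never below 0."""
    return max(0, 100 - 10 * (rank - 1))


def score_based_value(ranks):
    """Evaluation value using the sum of scores in place of the rank sum S.

    ranks: candidate ranks of the characters that make up one string.
    The sum of scores is divided by the string length N, as in J = S / N.
    """
    return sum(rank_score(rank) for rank in ranks) / len(ranks)


value = score_based_value([1, 2])  # one rank-1 and one rank-2 character
```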

[0024] Besides the display method described above, the display unit may vary the display using any one or more of, for example, different colors, different brightness, blinking, and underlining.

[0025] The post-processing unit may also perform knowledge processing using word information as follows, instead of as described above. To check whether a word corresponding to a character string of candidate characters exists in the word information, the character strings of one unit of knowledge processing are collated with the words of the word information, and the similarity or dissimilarity between each character string and each word is calculated. As a word corresponding to a character string, for example, a word whose similarity to the character string exceeds a predetermined threshold, or a word whose dissimilarity to the character string does not exceed a predetermined threshold, is detected. Then:

(1) When character strings whose similarity exceeds the predetermined threshold, or whose dissimilarity does not exceed the predetermined threshold, are detected, the maximum similarity or minimum dissimilarity among them is found. The word of the word information corresponding to the character string with that maximum similarity or minimum dissimilarity is output as the knowledge processing result, and that maximum similarity or minimum dissimilarity is output as the evaluation value of the knowledge processing.

(2) When not even one character string whose similarity exceeds the predetermined threshold, or whose dissimilarity does not exceed the predetermined threshold, can be detected after all the character strings of one unit of knowledge processing have been collated with the words of the word information, the character string formed by combining the first-ranked candidate characters is output as the knowledge processing result, and a predetermined lower limit of similarity or a predetermined upper limit of dissimilarity is output as the evaluation value.

The lower limit of similarity and the upper limit of dissimilarity indicate that no word corresponding to a character string of candidate characters existed in the word information.
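The similarity-based variant of paragraph [0025] can be sketched as below. The patent does not define the similarity measure, so the one used here (the fraction of character positions whose dictionary character appears among that position's candidates) is purely an assumption, as are the names `similarity` and `knowledge_process` and the default lower limit of 0.0.

```python
def similarity(candidates, word):
    """Assumed measure: share of positions where the word's character is a candidate."""
    if len(word) != len(candidates):
        return 0.0
    hits = sum(1 for position, ch in zip(candidates, word) if ch in position)
    return hits / len(word)


def knowledge_process(candidates, words, threshold, lower_limit=0.0):
    """Return (result, evaluation_value) following cases (1) and (2).

    candidates: one rank-ordered candidate list per character position.
    words: list of dictionary words.
    """
    scored = [(similarity(candidates, word), word) for word in words]
    above = [(score, word) for score, word in scored if score > threshold]
    if above:
        # Case (1): the word with the maximum similarity is the result.
        best_score, best_word = max(above, key=lambda pair: pair[0])
        return best_word, best_score
    # Case (2): fall back to first-ranked characters and report the lower
    # limit, which signals that no dictionary word corresponded.
    fallback = "".join(position[0] for position in candidates)
    return fallback, lower_limit


outcome = knowledge_process([["a", "b"], ["x", "c"]], ["bc", "ay"], threshold=0.5)
```

Returning the lower limit in the fallback case lets a caller tell a dictionary hit from a fallback by the evaluation value alone, which is the role the text assigns to that limit.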

[0026] Although the embodiment described above gives an example of knowledge processing using word information, the present invention may also be applied to a character recognition device that performs knowledge processing using context information or other knowledge information.

[0027]

[Effects of the Invention] As described above, according to the present invention, when the correct word is selected from the candidate words, the display on the display unit differs between the case where a word of word information that existed in the knowledge dictionary from the beginning is output as a candidate word and the case where a word of word information registered by the word registration unit is output as a candidate word. When selecting a candidate word, the operator can therefore easily distinguish words that existed in the knowledge dictionary from the beginning from words registered by the user, and can instantly find a candidate word that is likely to be correct and carry out correction, confirmation, and similar work. As a result, the time required to correct and confirm forms or documents is shortened. A character recognition device with good operability that can process forms or documents quickly can thus be provided.

[Brief description of drawings]

FIG. 1 is a functional block diagram showing the configuration of an embodiment of the present invention.

FIG. 2 is a diagram showing an example of a form.

FIG. 3 is a diagram showing an example of candidate word selection.

[Explanation of symbols]

10: character recognition device
12: recognition unit
14: word registration unit
16: post-processing unit
18: result editing unit
20: display unit
22: input unit
24: control unit
26: photoelectric conversion unit
28: image memory

Claims

[Claims]

1. A character recognition device comprising: a recognition unit that outputs a recognition result for a character pattern cut out from quantized image data of a form or document; a word registration unit for registering word information in a knowledge dictionary; a post-processing unit that uses the knowledge dictionary to output a knowledge processing result based on the recognition result together with candidate words; a result editing unit that edits the post-processing result; a display unit that displays the candidate words; and an input unit for inputting a correct word, characterized in that the display unit changes the display method of a candidate word depending on whether the candidate word is a word of word information that existed in the knowledge dictionary from the beginning or a word of word information registered by the word registration unit.