JPH07129713A

JPH07129713A - Character recognition device

Info

Publication number: JPH07129713A
Application number: JP5273460A
Authority: JP
Inventors: Tamotsu Maeda; 保前田
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1993-11-01
Filing date: 1993-11-01
Publication date: 1995-05-19

Abstract

PURPOSE: To provide a character recognition device capable of high-precision character recognition without troubling the operator. CONSTITUTION: The device includes a binarizing part 12, which binarizes multilevel picture data stored in a multilevel picture memory 11 using plural thresholds, and plural picture memories 14, 15, and 16, which store the plural binary picture data obtained by the binarizing part 12. A recognition precision calculating part 23 calculates the character recognition precision for each of the binary picture data stored in the picture memories. A discriminating part 25 selects, from the binary picture data stored in picture memories 14, 15, and 16, the data having the highest recognition precision. The result of character recognition for the selected binary picture data is displayed on a display part 26.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a character recognition device that recognizes characters in a document image read by an image input device.

[0002]

[Prior Art] FIG. 9 is a block diagram showing the functional configuration of a conventional character recognition device.

[0003] In FIG. 9, an image scanner 10 reads a document image from an original, and the read document image is supplied to a binarization unit 12 as multi-valued image data. The binarization unit 12 binarizes the multi-valued image data with a predetermined threshold and supplies the resulting binary image data to a binary image memory 1. That is, the binarization unit 12 stores "1" in the binary image memory 1 when the multi-valued image data is larger than the predetermined threshold, and stores "0" in the binary image memory 1 when the multi-valued image data is smaller than the predetermined threshold.
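
The thresholding rule described above can be illustrated with a minimal Python sketch. The function name `binarize` and the nested-list image representation are assumptions made for illustration. The source does not say how a pixel exactly equal to the threshold is handled, so the sketch arbitrarily maps it to 0.

```python
def binarize(gray, threshold):
    """Return a binary image: 1 where the pixel value exceeds the threshold, else 0.

    Assumption: a pixel equal to the threshold maps to 0 (the source leaves
    this case undefined).
    """
    return [[1 if px > threshold else 0 for px in row] for row in gray]


# Hypothetical 2x2 multi-valued image with 256 gray levels.
gray = [[10, 200], [128, 129]]
binary = binarize(gray, 128)
```

With a threshold that is too high or too low, most pixels fall on one side, which corresponds to the crushed or faint characters the document goes on to discuss.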

[0004] An area designation unit 18 designates the range of the area to be recognized within the binary image data in the binary image memory 1. A character cutout unit 19 cuts out character patterns from the image pattern within the area designated by the area designation unit 18. A feature extraction unit 20 extracts character features from those character patterns.

[0005] A character recognition unit 21 compares the character features extracted by the feature extraction unit 20 with the standard features stored in a recognition dictionary 22, and causes a display unit 26 to display the character of the standard feature having the highest similarity as a recognition candidate character.
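
The source does not specify the feature representation or the similarity measure. The following sketch assumes fixed-length numeric feature vectors and cosine similarity purely for illustration; `cosine_similarity`, `recognize`, and the toy dictionary are hypothetical names and data, not part of the disclosed device.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recognize(feature, dictionary):
    """Return the dictionary character whose standard feature is most similar."""
    return max(dictionary, key=lambda ch: cosine_similarity(feature, dictionary[ch]))


# Hypothetical recognition dictionary: character -> standard feature vector.
dictionary = {"A": [1.0, 0.0, 1.0], "B": [0.0, 1.0, 1.0]}
candidate = recognize([0.9, 0.1, 1.0], dictionary)
```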

[0006]

[Problems to Be Solved by the Invention] In the conventional character recognition device described above, if the threshold used by the binarization unit 12 is not appropriate, the binarized image contains crushed or faint characters, and the accuracy of character recognition by the character recognition unit 21 decreases.

[0007] In this case, to secure recognition accuracy with the conventional character recognition device, the operator must repeat the reading of the original by the image scanner 10 and the character recognition while changing the threshold of the binarization unit 12.

[0008] Therefore, an object of the present invention is to provide a character recognition device that can automatically recognize the characters of an original with high accuracy without troubling the operator.

[0009]

[Means for Solving the Problems] (1) First Invention: A character recognition device according to the first invention comprises first storage means, binarization means, second storage means, recognition accuracy calculation means, and selection means.

[0010] The first storage means stores a read document image as multi-valued image data. The binarization means binarizes the multi-valued image data stored in the first storage means with a plurality of different thresholds to obtain a plurality of sets of binary image data. The second storage means stores the plurality of sets of binary image data obtained by the binarization means. The recognition accuracy calculation means calculates the character recognition accuracy for each of the sets of binary image data stored in the second storage means. The selection means compares the plurality of recognition accuracies obtained by the recognition accuracy calculation means and selects, from the sets of binary image data stored in the second storage means, the binary image data having the highest recognition accuracy.
(2) Second Invention: A character recognition device according to the second invention comprises reading means, binarization means, storage means, recognition accuracy calculation means, and control means.

[0011] The reading means reads a document image as multi-valued image data. The binarization means binarizes the multi-valued image data read by the reading means with a predetermined threshold to obtain binary image data. The storage means stores the binary image data obtained by the binarization means. The recognition accuracy calculation means calculates the character recognition accuracy for the binary image data stored in the storage means. The control means compares the recognition accuracy obtained by the recognition accuracy calculation means with a predetermined value and, when the recognition accuracy is lower than the predetermined value, causes the reading means to read the document image again and causes the binarization means to perform binarization with a changed threshold.
(3) Third Invention: A character recognition device according to the third invention comprises first storage means, threshold determination means, binarization means, second storage means, recognition accuracy calculation means, and control means.

[0012] The first storage means stores a read document image as multi-valued image data. The binarization means binarizes the multi-valued image data stored in the first storage means with the threshold determined by the threshold determination means to obtain binary image data. The second storage means stores the binary image data obtained by the binarization means. The recognition accuracy calculation means calculates the character recognition accuracy for the binary image data stored in the second storage means. The control means compares the recognition accuracy obtained by the recognition accuracy calculation means with a predetermined value and, when the recognition accuracy is lower than the predetermined value, causes the threshold determination means to change the threshold and causes the binarization means to perform binarization with the changed threshold.
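
The control behavior of the third invention can be sketched as a retry loop over the stored multi-valued image. The source does not say how the threshold determination means chooses the next threshold or what happens if no threshold reaches the predetermined value. The sketch assumes a fixed candidate list tried in order and falls back to the best result seen. `binarize_until_accurate` and the toy `evaluate` function are hypothetical.

```python
def binarize_until_accurate(gray, candidate_thresholds, target, evaluate):
    """Re-binarize the stored image with successive thresholds until the
    recognition accuracy is no longer below `target`.

    Returns (threshold, accuracy). Assumption: if no candidate reaches the
    target, the best result seen so far is returned.
    """
    best = None
    for threshold in candidate_thresholds:
        binary = [[1 if px > threshold else 0 for px in row] for row in gray]
        accuracy = evaluate(binary)
        if best is None or accuracy > best[1]:
            best = (threshold, accuracy)
        if accuracy >= target:
            break
    return best


def evaluate(binary):
    """Toy stand-in for recognition accuracy: highest when half the pixels are 1."""
    pixels = [px for row in binary for px in row]
    return 1 - abs(sum(pixels) / len(pixels) - 0.5) * 2


gray = [[10, 100], [150, 250]]
result = binarize_until_accurate(gray, [200, 120], target=0.9, evaluate=evaluate)
```

The second invention differs only in that each retry also re-reads the original with the reading means before binarizing.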

[0013]

[Operation] (1) First Invention: In the character recognition device according to the first invention, the multi-valued image data is binarized with a plurality of different thresholds, and the character recognition accuracy is calculated for each of the resulting sets of binary image data. The binary image data having the highest recognition accuracy is then selected from among them. As a result, the binary image data with the highest recognition accuracy is obtained automatically.
(2) Second Invention: In the character recognition device according to the second invention, the reading of the document image and the binarization of the multi-valued image data are repeated while the threshold is changed automatically. As a result, binary image data with high recognition accuracy is obtained automatically.
(3) Third Invention: In the character recognition device according to the third invention, the binarization of the multi-valued image data is repeated while the threshold is changed automatically. As a result, binary image data with high recognition accuracy is obtained automatically.

[0014]

[Embodiments] FIG. 1 is a block diagram showing the functional configuration of a character recognition device according to a first embodiment of the present invention.

[0015] In this character recognition device, a multi-valued image memory 11 that stores multi-valued image data is provided between the image scanner 10 and the binarization unit 12. The binarization unit 12 of this embodiment binarizes the multi-valued image data with three thresholds to obtain three sets of binary image data. A binary image memory 13 includes a first image memory 14, a second image memory 15, and a third image memory 16, corresponding to the three sets of binary image data. The first image memory 14, the second image memory 15, and the third image memory 16 each store one of the three sets of binary image data obtained by the binarization unit 12.

[0016] Between the binary image memory 13 and the area designation unit 18, an image selection unit 17 is provided that selects one of the sets of binary image data in the first image memory 14, the second image memory 15, and the third image memory 16.

[0017] The functions of the area designation unit 18, the character cutout unit 19, the feature extraction unit 20, the character recognition unit 21, and the recognition dictionary 22 are the same as those of the corresponding parts of the conventional character recognition device shown in FIG. 9.

[0018] Between the character recognition unit 21 and the display unit 26 are provided a recognition accuracy calculation unit 23 that calculates the accuracy of the recognition result obtained by the character recognition unit 21, a storage processing unit 24 that controls storage processing, and a determination unit 25 that makes various determinations. The determination unit 25 constitutes the selection means.

[0019] FIG. 2 is a block diagram showing the hardware configuration of the character recognition device according to the first embodiment. In FIGS. 1 and 2, the same reference numerals denote the same or corresponding parts.

[0020] This character recognition device comprises the image scanner 10, a central processing unit 100, the display unit 26, a keyboard 101, a read-only memory 102, and a random access memory 105.

[0021] The read-only memory 102 includes the recognition dictionary 22, a program storage area 103, and a threshold storage area 104. The recognition dictionary 22 stores the standard features of various characters. The program storage area 103 stores a program for executing the functions of the blocks shown in FIG. 1. The threshold storage area 104 stores the plurality of thresholds used for binarization by the binarization unit 12 of FIG. 1. Each threshold is a reference value for classifying a pixel that can take three or more values as either a white pixel or a black pixel.

[0022] The random access memory 105 includes the binary image memory 13, a first character code area 106, a second character code area 107, a third character code area 108, a final character code area 109, an image memory instruction flag 110, the multi-valued image memory 11, a work area 111, a first recognition accuracy area 112, a second recognition accuracy area 113, and a third recognition accuracy area 114.

[0023] The binary image memory 13 corresponds to the binary image memory 13 shown in FIG. 1. The first character code area 106, the second character code area 107, and the third character code area 108 store the character codes of the recognition candidate characters obtained by character recognition on the binary image data in the first image memory 14, the second image memory 15, and the third image memory 16, respectively.

[0024] The final character code area 109 stores, from among the character codes stored in the first character code area 106, the second character code area 107, and the third character code area 108, the character codes obtained with the highest recognition accuracy. The multi-valued image memory 11 corresponds to the multi-valued image memory 11 shown in FIG. 1.

[0025] The first recognition accuracy area 112 stores the recognition accuracy of character recognition on the binary image data in the first image memory 14. The second recognition accuracy area 113 stores the recognition accuracy of character recognition on the binary image data in the second image memory 15. The third recognition accuracy area 114 stores the recognition accuracy of character recognition on the binary image data in the third image memory 16.

[0026] The image memory instruction flag 110 indicates which of the sets of binary image data stored in the first image memory 14, the second image memory 15, and the third image memory 16 is to be input to the character cutout unit 19 of FIG. 1. Specifically, during character recognition processing, the image memory instruction flag 110 is set to "-1", "-2", or "-3" to specify the binary image data in the first image memory 14, the second image memory 15, or the third image memory 16, respectively.

[0027] The image memory instruction flag 110 is also set to "1" when character recognition on the binary image data in the first image memory 14 is found to have the highest recognition accuracy, to "2" when character recognition on the binary image data in the second image memory 15 is found to have the highest recognition accuracy, and to "3" when character recognition on the binary image data in the third image memory 16 is found to have the highest recognition accuracy.

[0028] Next, the operation of the character recognition device shown in FIGS. 1 and 2 is described with reference to the flowchart of FIG. 3.

[0029] First, in step S1, the image scanner 10 reads a document image from the original, and the image is stored in the multi-valued image memory 11 as multi-valued image data. In step S2, the binarization unit 12 converts the multi-valued image data in the multi-valued image memory 11 into three sets of binary image data using the three thresholds stored in the threshold storage area 104, and stores the converted binary image data in the first image memory 14, the second image memory 15, and the third image memory 16, respectively.

[0030] For example, when the multi-valued image data has 256 gray levels, three suitable values such as "64", "128", and "192" are determined in advance as the thresholds. In the first image memory 14, "1" is stored for each pixel whose density is greater than "64", and "0" is stored for each pixel whose density is less than "64". Similarly, in the second image memory 15, "1" is stored for each pixel whose density is greater than "128", and "0" is stored for each pixel whose density is less than "128". In the third image memory 16, "1" is stored for each pixel whose density is greater than "192", and "0" is stored for each pixel whose density is less than "192".
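
The three-threshold example above can be sketched as follows. The nested-list image, the name `binarize_all`, and the sample pixel values are assumptions for illustration; pixels exactly equal to a threshold are mapped to 0 here because the source does not define that case.

```python
# Thresholds from the example in the text (256 gray levels).
THRESHOLDS = (64, 128, 192)


def binarize_all(gray, thresholds=THRESHOLDS):
    """Return one binary image per threshold, in threshold order."""
    return [
        [[1 if px > t else 0 for px in row] for row in gray]
        for t in thresholds
    ]


# Hypothetical one-row image; the three results stand in for the
# first, second, and third image memories.
gray = [[50, 100, 150, 200]]
first_image, second_image, third_image = binarize_all(gray)
```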

[0031] Then, in step S3, the image memory instruction flag 110 is set to "-1". In step S4, the image selection unit 17 selects, based on the image memory instruction flag 110, which of the sets of binary image data stored in the first image memory 14, the second image memory 15, and the third image memory 16 is to undergo character recognition processing. Further, the area designation unit 18 designates a recognition target area in the binary image data in the image memory selected by the image selection unit 17. For example, in the binary image data stored in the first image memory 14, a rectangular recognition target area whose diagonal is the straight line connecting the points (10, 120) and (100, 330) can be designated. The processing of step S4 is described in detail later with reference to the flowchart of FIG. 4.
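
As an illustration of designating a rectangular area by its diagonal, the following sketch normalizes two corner points into a bounding box and crops a binary image with it. The coordinate convention (x is the column, y is the row, both bounds inclusive) and the function names are assumptions; the source gives only the two points.

```python
def region_from_diagonal(p, q):
    """Return (left, top, right, bottom) for the rectangle whose diagonal is p-q."""
    (x0, y0), (x1, y1) = p, q
    return min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1)


def crop(binary, region):
    """Extract the region from a nested-list image (bounds inclusive)."""
    left, top, right, bottom = region
    return [row[left:right + 1] for row in binary[top:bottom + 1]]


# The points from the example in the text.
region = region_from_diagonal((10, 120), (100, 330))

# Small hypothetical image to show the cropping itself.
small = [[0, 1, 1, 0],
         [0, 1, 0, 0],
         [1, 1, 1, 1]]
patch = crop(small, region_from_diagonal((2, 1), (1, 0)))
```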

[0032] In step S5, the character cutout unit 19 cuts out character patterns from the recognition target area designated by the area designation unit 18. In step S6, the feature extraction unit 20 extracts the features of each character for a predetermined number of the character patterns cut out by the character cutout unit 19. In step S7, the character recognition unit 21 compares the features of each character extracted by the feature extraction unit 20 with the standard features stored in the recognition dictionary 22 and determines the similarities. The character of the standard feature having the maximum similarity is then determined as the recognition candidate character.

[0033] In step S8, the recognition accuracy calculation unit 23 calculates the recognition accuracy of the character recognition performed by the character recognition unit 21. In general, in character recognition, the recognition candidate character tends to be correct when the maximum similarity is greater than a predetermined value or when the difference between the maximum similarity and the next largest similarity (hereinafter, the similarity difference) is greater than a predetermined value, and tends to be incorrect when the maximum similarity and the similarity difference are each smaller than their predetermined values. Therefore, each character is regarded as "confirmable" when its maximum similarity or its similarity difference is greater than the respective predetermined value, and as not "confirmable" when its maximum similarity and its similarity difference are each smaller than the respective predetermined values, and the number of confirmed characters and the number of unconfirmed characters are counted. The recognition accuracy is then defined as (number of confirmed characters) / (number of confirmed characters + number of unconfirmed characters).
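
The accuracy definition of step S8 can be sketched as follows. The input format (a list of similarity scores per character, at least two per character), the parameter names `min_similarity` and `min_gap`, and the sample values are assumptions. Values exactly equal to a predetermined value are treated as not confirmed, a case the source leaves open.

```python
def is_confirmed(similarities, min_similarity, min_gap):
    """A character is confirmed if its best similarity, or the gap between the
    best and second-best similarity, exceeds the respective predetermined value."""
    ranked = sorted(similarities, reverse=True)
    best, runner_up = ranked[0], ranked[1]
    return best > min_similarity or (best - runner_up) > min_gap


def recognition_accuracy(per_char_similarities, min_similarity, min_gap):
    """confirmed / (confirmed + unconfirmed); 0.0 for an empty input (assumption)."""
    total = len(per_char_similarities)
    if total == 0:
        return 0.0
    confirmed = sum(
        1 for s in per_char_similarities if is_confirmed(s, min_similarity, min_gap)
    )
    return confirmed / total


# Hypothetical similarity scores for four characters.
scores = [[0.95, 0.90], [0.70, 0.40], [0.60, 0.55], [0.50, 0.48]]
accuracy = recognition_accuracy(scores, min_similarity=0.9, min_gap=0.2)
```

In this sample the first character is confirmed by its high maximum similarity and the second by its large similarity difference, so two of four characters are confirmed.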

[0034] In step S9, the storage processing unit 24 stores the character codes of the recognition candidate characters obtained in step S7 in the first character code area 106, the second character code area 107, or the third character code area 108, and stores the recognition accuracy obtained in step S8 in the first recognition accuracy area 112, the second recognition accuracy area 113, or the third recognition accuracy area 114. The processing of step S9 is described in detail later with reference to the flowchart of FIG. 5.

[0035] Next, in step S10, the determination unit 25 determines whether the image memory instruction flag 110 is "-3". When the image memory instruction flag 110 is "-3", the process proceeds to step S12; when it is "-1" or "-2", the process proceeds to step S11. When the image memory instruction flag 110 is "-1" or "-2", the calculation of the recognition accuracy has not yet been completed for at least one of the three image memories 14, 15, and 16.

[0036] In step S11, the determination unit 25 subtracts 1 from the value of the image memory instruction flag 110, and the process returns to step S4. The processing of steps S4 to S9 is repeated until the image memory instruction flag 110 becomes "-3".
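
The flag-driven loop of steps S3 through S11 can be sketched as below. The callback `process` stands in for steps S4 to S9 and is a hypothetical placeholder; only the flag values and the branch at step S10 come from the text.

```python
def evaluate_all_memories(process):
    """Run steps S4-S9 once for each flag value -1, -2, -3."""
    flag = -1                # step S3
    while True:
        process(flag)        # steps S4 to S9 for the memory the flag selects
        if flag == -3:       # step S10: all three memories evaluated
            break
        flag -= 1            # step S11
    return flag


visited = []
last_flag = evaluate_all_memories(visited.append)
```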

[0037] In step S12, the determination unit 25 selects, based on the recognition accuracies obtained in step S8, the image memory among the first image memory 14, the second image memory 15, and the third image memory 16 that stores the binary image data with the highest recognition accuracy, and sets the corresponding value among "1", "2", and "3" in the image memory instruction flag 110.

[0038] In step S13, the storage processing unit 24 stores the character codes in the character code area corresponding to the image memory selected in step S12 into the final character code area 109. That is, when the first image memory 14 is selected, the character codes in the first character code area 106 are copied to the final character code area 109; when the second image memory 15 is selected, the character codes in the second character code area 107 are copied to the final character code area 109; and when the third image memory 16 is selected, the character codes in the third character code area 108 are copied to the final character code area 109.
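
Steps S12 and S13 amount to an argmax over the three stored accuracies followed by a copy. In the sketch below the list-based storage, the function names, and the sample data are assumptions; in particular, the source does not say which memory wins on a tie, and the sketch picks the lowest-numbered one.

```python
def select_best_memory(accuracies):
    """Step S12: return 1, 2, or 3 for the image memory with the highest accuracy.

    Assumption: on a tie the lowest-numbered memory is chosen.
    """
    best_index = max(range(len(accuracies)), key=lambda i: accuracies[i])
    return best_index + 1


def copy_final_codes(flag, code_areas):
    """Step S13: copy the winning character codes into the final area."""
    return list(code_areas[flag - 1])


# Hypothetical contents of the recognition accuracy and character code areas.
accuracies = [0.62, 0.91, 0.78]
code_areas = [["A", "8"], ["A", "B"], ["4", "B"]]
flag = select_best_memory(accuracies)
final_codes = copy_final_codes(flag, code_areas)
```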

[0039] In step S14, the feature extraction unit 20 extracts character features from the remaining character patterns that were not processed in step S6, in the binary image data in the image memory selected in step S12. In step S15, the character cutout unit 19 cuts out character patterns from the recognition target area designated in step S21, described later. In step S16, the feature extraction unit 20 extracts character features from the character patterns cut out in step S15.

[0040] In step S17, the character recognition unit 21 compares the character features extracted in step S14 or step S16 with the standard features stored in the recognition dictionary 22 and determines the similarities. The character code of the character of the standard feature having the maximum similarity is then determined. In step S18, the storage processing unit 24 stores the character code obtained in step S17 in the final character code area 109. In step S19, the determination unit 25 causes the display unit 26 to display the character codes stored in the final character code area 109.

[0041] In step S20, it is determined whether character recognition processing is to be performed on another area. If so, the process proceeds to step S21; otherwise, the process ends. In step S21, the area designation unit 18 designates a recognition target area, and the process proceeds to step S15.

[0042] Next, the processing of the image selection unit 17 in step S4 of FIG. 3 is described with reference to the flowchart of FIG. 4.

[0043] First, it is determined whether the image memory instruction flag 110 is "-1" (step S100). When the image memory instruction flag 110 is "-1", the first image memory 14 is selected as the target of character recognition processing (step S103). In this case, the area designation unit 18 designates a recognition target area in the binary image data in the first image memory 14 (step S104).

[0044] When the image memory instruction flag 110 is not "-1" in step S100, it is determined whether the image memory instruction flag 110 is "-2" (step S101). When the image memory instruction flag 110 is "-2", the second image memory 15 is selected as the target of character recognition processing (step S105). In this case, character recognition processing is performed on the recognition target area that was designated when the first image memory 14 was selected.

[0045] When the image memory instruction flag 110 is not "-2" in step S101, it is determined whether the image memory instruction flag 110 is "-3" (step S102). When the image memory instruction flag 110 is "-3", the third image memory 16 is selected as the target of character recognition processing (step S106). In this case too, character recognition processing is performed on the recognition target area that was designated when the first image memory 14 was selected.

[0046] When the image memory instruction flag 110 is not "-3" in step S102, the process ends.
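
The branching of FIG. 4 (steps S100 to S106) can be sketched as a small dispatch function. The signature, the list `memories`, and the callback `designate_region` are hypothetical; the point taken from the text is that the area is designated only for the first image memory and reused for the other two.

```python
def select_image(flag, memories, designate_region, previous_region=None):
    """Return (binary image, recognition target region) for the given flag,
    or None when the flag is none of -1, -2, -3."""
    if flag == -1:   # S100 -> S103, S104: select memory 14 and designate the area
        return memories[0], designate_region(memories[0])
    if flag == -2:   # S101 -> S105: select memory 15, reuse the earlier area
        return memories[1], previous_region
    if flag == -3:   # S102 -> S106: select memory 16, reuse the earlier area
        return memories[2], previous_region
    return None      # S102 "no" branch: processing ends


# Placeholder strings stand in for the three binary images.
memories = ["image14", "image15", "image16"]
image, region = select_image(-1, memories, lambda img: (10, 120, 100, 330))
second = select_image(-2, memories, None, region)
```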

[0047] Next, the processing of the storage processing unit 24 in step S9 of FIG. 3 is described with reference to the flowchart of FIG. 5.

[0048] First, it is determined whether the image memory instruction flag 110 is "-1" (step S200). When the image memory instruction flag 110 is "-1", the process proceeds to step S204. Step S204 is performed after character recognition processing has been carried out on the binary image data in the first image memory 14. In step S204, the character codes of the recognition candidate characters are stored in the first character code area 106, and in step S205, the recognition accuracy for the binary image data in the first image memory 14 is stored in the first recognition accuracy storage unit 112.

[0049] When the image memory instruction flag 110 is not "-1" in step S200, it is determined whether the flag is "-2" (step S201). When the flag is "-2", the process proceeds to step S206. Step S206 is performed after character recognition processing has been carried out on the binary image data in the second image memory 15. In step S206, the character codes of the recognition candidate characters are stored in the second character code area 107, and in step S207, the recognition accuracy for the binary image data in the second image memory 15 is stored in the second recognition accuracy storage unit 113.

[0050] When the image memory instruction flag 110 is not "-2" in step S201, the process proceeds to step S202. Step S202 is performed after character recognition processing has been carried out on the binary image data in the third image memory 16. In step S202, the character codes of the recognition candidate characters are stored in the third character code area 108, and in step S203, the recognition accuracy for the binary image data in the third image memory 16 is stored in the third recognition accuracy storage unit 114.
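The storage flow of steps S200 to S207 can be sketched as follows. This is a minimal illustration, assuming the random access memory is modeled as a Python dictionary; the function name `store_result` and the key names are hypothetical. As in the flowchart described above, any flag that is neither -1 nor -2 falls through to the third slot.

```python
from typing import Dict, List


def store_result(flag: int, codes: List[str], accuracy: float,
                 ram: Dict[str, object]) -> Dict[str, object]:
    """Store candidate character codes and recognition accuracy for one image memory."""
    if flag == -1:        # steps S204, S205
        slot = "first"
    elif flag == -2:      # steps S206, S207
        slot = "second"
    else:                 # steps S202, S203 (no separate check for -3 in the text)
        slot = "third"
    ram[f"{slot}_character_codes"] = list(codes)
    ram[f"{slot}_recognition_accuracy"] = accuracy
    return ram
```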

[0051] The operation of the character recognition device of the first embodiment will now be described using a specific example. FIG. 6(a) schematically shows the layout of a document read by the image scanner 10. In practice, character patterns and other image patterns are printed on it, as shown for example in FIG. 6(b).

[0052] The document of FIG. 6(a) is read by the image scanner 10 and stored in the multi-valued image memory 11 as multi-valued image data. The binarization unit 12 then converts the multi-valued image data in the multi-valued image memory 11 into three sets of binary image data using three different threshold values, and these are stored in the first image memory 14, the second image memory 15, and the third image memory 16, respectively.
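Binarizing one multi-valued image with several threshold values can be sketched as below. This assumes the rule stated for the binarization unit earlier in the document (a pixel greater than the threshold becomes 1) and assumes the remaining pixels become 0. The function names, the sample pixel values, and the three thresholds are hypothetical.

```python
from typing import List, Sequence

Image = List[List[int]]


def binarize(image: Image, threshold: int) -> Image:
    """Map each pixel to 1 if it is greater than the threshold, otherwise 0."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


def binarize_with_thresholds(image: Image, thresholds: Sequence[int]) -> List[Image]:
    """Produce one binary image per threshold, in the order given."""
    return [binarize(image, t) for t in thresholds]


# Made-up 2x3 multi-valued image and three made-up thresholds.
multi_valued = [[10, 120, 200], [90, 160, 250]]
first, second, third = binarize_with_thresholds(multi_valued, (64, 128, 192))
```

The three results would correspond to the contents of the first, second, and third image memories.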

[0053] Suppose that the operator designates area 1 of FIG. 6(a) as the recognition target area using the area designation unit 18, and that the character pattern shown in FIG. 6(b) is the actual character pattern of area 1. After the character cutout unit 19 cuts out the character patterns contained in area 1, the feature extraction unit 20 extracts character features for 10 of the cut-out character patterns, and the character recognition unit 21 performs character recognition processing.

[0054] FIG. 6(c) shows the state in which the recognition candidate characters are stored in the first character code area 106, the second character code area 107, and the third character code area 108. In the figure, the characters enclosed in a rectangle in the "contents of storage location" column are characters that could be confirmed because the maximum similarity or the similarity difference was higher than its respective predetermined threshold value. Characters not enclosed in a rectangle are characters that could not be confirmed.
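The confirmation test described above can be sketched as follows. This is an assumption-laden illustration: the similarity difference is taken here to be the gap between the best and second-best candidate similarities, and the threshold values 0.9 and 0.2 as well as the function name `is_confirmed` are hypothetical.

```python
from typing import Sequence


def is_confirmed(similarities: Sequence[float],
                 max_threshold: float = 0.9,
                 diff_threshold: float = 0.2) -> bool:
    """Return True if a character can be confirmed.

    `similarities` holds the similarity of each candidate for one character
    pattern and must not be empty.
    """
    ranked = sorted(similarities, reverse=True)
    best = ranked[0]
    second = ranked[1] if len(ranked) > 1 else 0.0
    # Confirmed if either value exceeds its own threshold.
    return best > max_threshold or (best - second) > diff_threshold
```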

[0055] In this case, the recognition accuracy obtained from the number of confirmed characters and the number of unconfirmed characters is as shown in FIG. 6(d). In this example, the recognition accuracy is highest when characters are recognized from the second image memory 15. In the subsequent processing, therefore, character recognition for the 11th and later character patterns shown in FIG. 6(b) is also performed on the binary image data in the second image memory 15. For areas other than area 1, character recognition is likewise performed on the binary image data in the second image memory 15.
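Choosing the image memory with the highest recognition accuracy can be sketched as below. The accuracy formula used here (confirmed characters divided by all evaluated characters) is an assumption for illustration, and the counts are made up rather than taken from FIG. 6(d). The names `recognition_accuracy` and `select_best_memory` are hypothetical.

```python
from typing import Dict, Tuple


def recognition_accuracy(confirmed: int, unconfirmed: int) -> float:
    """Assumed measure: fraction of evaluated characters that were confirmed."""
    total = confirmed + unconfirmed
    return confirmed / total if total else 0.0


def select_best_memory(counts: Dict[int, Tuple[int, int]]) -> int:
    """Return the key whose (confirmed, unconfirmed) counts give the highest accuracy."""
    return max(counts, key=lambda memory: recognition_accuracy(*counts[memory]))


# Made-up counts for 10 evaluated characters per image memory.
counts = {14: (6, 4), 15: (9, 1), 16: (7, 3)}
best = select_best_memory(counts)
```

With these made-up counts the second image memory 15 is chosen, and it would then be used for the remaining characters and areas.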

[0056] Thus, according to the character recognition device of the first embodiment, the multi-valued image data is converted into a plurality of binary image data using a plurality of threshold values, and character recognition processing is performed using the binary image data with high recognition accuracy. Character recognition processing is therefore performed with high accuracy after reading the document only once.

[0057] Although binarization is performed with three threshold values in the above embodiment, any number of threshold values may be used as long as there are two or more.

[0058] FIG. 7 is a block diagram showing the functional configuration of a character recognition device according to the second embodiment of the present invention.

[0059] The character recognition device of FIG. 7 differs from that of FIG. 1 in three respects: the multi-valued image memory 11 and the image selection unit 17 are not provided; a binary image memory 1 consisting of a single image memory is provided in place of the binary image memory 13 consisting of three image memories; and the processing of the determination unit 25 is different. The determination unit 25 constitutes the control means.

[0060] First, the image scanner 10 reads the document as multi-valued image data. The binarization unit 12 binarizes the multi-valued image data output from the image scanner 10 with a predetermined threshold value and stores the resulting binary image data in the binary image memory 1. The operations of the area designation unit 18, the character cutout unit 19, the feature extraction unit 20, the character recognition unit 21, and the recognition accuracy calculation unit 23 are the same as in the first embodiment. The recognition accuracy calculation unit 23 calculates the character recognition accuracy for the binary image data in the binary image memory 1, and the storage processing unit 24 stores it in the random access memory.

[0061] The determination unit 25 determines whether this recognition accuracy is higher than a predetermined value. When the recognition accuracy is lower than the predetermined value, the determination unit 25 causes the image scanner 10 to read the document again as multi-valued image data. The binarization unit 12 binarizes the multi-valued image data output from the image scanner 10 with a threshold value different from the previous one and stores the resulting binary image data in the binary image memory 1. The same processing as before is then performed, and the recognition accuracy calculation unit 23 calculates the character recognition accuracy. This processing is repeated until the recognition accuracy becomes higher than the predetermined value or the number of binarization passes exceeds a predetermined number.
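The re-read loop of the second embodiment can be sketched as follows. This is only a sketch under stated assumptions: the scanner and the recognition pipeline are passed in as the hypothetical callables `scan` and `evaluate`, the sequence of thresholds to try is supplied by the caller because the text does not say how each new threshold is chosen, and `max_attempts` stands in for the predetermined number of binarization passes.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Image = List[List[int]]


def binarize(image: Image, threshold: int) -> Image:
    """Map each pixel to 1 if it is greater than the threshold, otherwise 0."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


def rescan_until_accurate(
    scan: Callable[[], Image],
    evaluate: Callable[[Image], float],
    thresholds: Sequence[int],
    required_accuracy: float,
    max_attempts: int,
) -> Optional[Tuple[int, float, bool]]:
    """Re-read the document and binarize with a new threshold on each pass.

    Returns (threshold, accuracy, succeeded) for the last pass performed,
    or None if no pass was performed.
    """
    result = None
    for threshold in thresholds[:max_attempts]:
        multi_valued = scan()  # the document is read again on every pass
        accuracy = evaluate(binarize(multi_valued, threshold))
        result = (threshold, accuracy, accuracy > required_accuracy)
        if result[2]:
            break
    return result
```

Note that `scan` is called once per pass, which reflects the absence of a multi-valued image memory in this embodiment.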

[0062] Thus, according to the character recognition device of the second embodiment, binarization is repeated while the document is automatically re-read and the threshold value is changed, so character recognition processing is performed with high accuracy, as with the character recognition device of the first embodiment.

[0063] FIG. 8 is a block diagram showing the functional configuration of a character recognition device according to the third embodiment of the present invention.

[0064] The character recognition device of FIG. 8 differs from that of FIG. 1 in four respects: a binarization threshold value determination unit 30 is newly provided; a binary image memory 1 consisting of a single image memory is provided in place of the binary image memory 13 consisting of three image memories; the image selection unit 17 is not provided; and the processing of the determination unit 25 is different. The determination unit 25 constitutes the control means.

[0065] First, the image scanner 10 reads the document as multi-valued image data, and the multi-valued image data is stored in the multi-valued image memory 11. The binarization threshold value determination unit 30 determines a threshold value for binarization. The binarization unit 12 binarizes the multi-valued image data in the multi-valued image memory 11 using the determined threshold value and stores the resulting binary image data in the binary image memory 1. The operations of the area designation unit 18, the character cutout unit 19, the feature extraction unit 20, the character recognition unit 21, and the recognition accuracy calculation unit 23 are the same as in the first embodiment. The recognition accuracy calculation unit 23 calculates the character recognition accuracy for the binary image data in the binary image memory 1, and the storage processing unit 24 stores it in the random access memory.

[0066] The determination unit 25 determines whether this recognition accuracy is higher than a predetermined value. When the recognition accuracy is lower than the predetermined value, the determination unit 25 causes the binarization threshold value determination unit 30 to change the threshold value. The binarization unit 12 binarizes the multi-valued image data in the multi-valued image memory 11 with the changed threshold value and stores the resulting binary image data in the binary image memory 1. The same processing as before is then performed, and the recognition accuracy calculation unit 23 calculates the character recognition accuracy. This processing is repeated until the recognition accuracy becomes higher than the predetermined value or the number of binarization passes exceeds a predetermined number.
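The third embodiment's loop can be sketched in the same way, except that the stored multi-valued image is reused instead of re-reading the document. The text does not say how the threshold value determination unit picks a new threshold, so the fixed `step` used here is purely an assumption, as are the function name and the callable `evaluate` that stands in for the recognition pipeline.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]


def binarize(image: Image, threshold: int) -> Image:
    """Map each pixel to 1 if it is greater than the threshold, otherwise 0."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


def retry_with_new_threshold(
    stored_image: Image,
    evaluate: Callable[[Image], float],
    initial_threshold: int,
    step: int,
    required_accuracy: float,
    max_attempts: int,
) -> Tuple[int, float, bool]:
    """Re-binarize the stored image with a changed threshold until accurate enough.

    Returns (threshold, accuracy, succeeded) for the last pass performed.
    """
    threshold = initial_threshold
    accuracy = evaluate(binarize(stored_image, threshold))
    attempts = 1
    while accuracy <= required_accuracy and attempts < max_attempts:
        threshold += step  # assumed update rule; the source leaves this open
        accuracy = evaluate(binarize(stored_image, threshold))
        attempts += 1
    return threshold, accuracy, accuracy > required_accuracy
```

Because the multi-valued image stays in memory, each retry costs only a binarization and a recognition pass, with no additional scan.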

[0067] Thus, according to the character recognition device of the third embodiment, binarization is repeated while the threshold value is changed automatically, so character recognition processing is performed with high accuracy, as with the character recognition device of the first embodiment.

【００６８】[0068]

[Effects of the Invention] As described above, according to the first, second, and third inventions, binary image data with high recognition accuracy is obtained automatically. A character recognition device is therefore provided that can recognize the characters of a document image with high accuracy without troubling the operator.

【図面の簡単な説明】[Brief description of drawings]

[FIG. 1] Block diagram showing the functional configuration of the character recognition device according to the first embodiment of the present invention.

[FIG. 2] Block diagram showing the hardware configuration of the character recognition device according to the first embodiment of the present invention.

[FIG. 3] Flowchart of the operation of the character recognition device according to the first embodiment of the present invention.

[FIG. 4] Flowchart showing the processing procedure of the image selection unit in FIG. 1.

[FIG. 5] Flowchart showing the processing procedure of the storage processing unit in FIG. 1.

[FIG. 6] Diagram for explaining a specific operation of the character recognition device according to the first embodiment of the present invention.

[FIG. 7] Block diagram showing the functional configuration of the character recognition device according to the second embodiment of the present invention.

[FIG. 8] Block diagram showing the functional configuration of the character recognition device according to the third embodiment of the present invention.

[FIG. 9] Block diagram showing the functional configuration of a conventional character recognition device.

[Explanation of symbols]

10, 100: image scanner
11: multi-valued image memory
12: binarization unit
13: binary image memory
14: first image memory
15: second image memory
16: third image memory
17: image selection unit
18: area designation unit
19: character cutout unit
20: feature extraction unit
21: character recognition unit
22: recognition dictionary
23: recognition accuracy calculation unit
24: storage processing unit
25: determination unit
26: display unit
30: binarization threshold value determination unit
101: keyboard
102: read-only memory
103: program storage area
104: threshold value storage area
105: random access memory
106: first character code area
107: second character code area
108: third character code area
109: final character code area
110: image memory instruction flag
111: work area
112: first recognition accuracy area
113: second recognition accuracy area
114: third recognition accuracy area

Claims

[Claims]

1. A character recognition device comprising: first storage means for storing a read document image as multi-valued image data; binarizing means for binarizing the multi-valued image data stored in the first storage means with a plurality of different threshold values to obtain a plurality of binary image data; second storage means for storing the plurality of binary image data obtained by the binarizing means; recognition accuracy calculation means for calculating character recognition accuracy for each of the plurality of binary image data stored in the second storage means; and selection means for comparing the plurality of recognition accuracies obtained by the recognition accuracy calculation means with one another and selecting, from among the plurality of binary image data stored in the second storage means, the binary image data having the highest recognition accuracy.

2. A character recognition device comprising: reading means for reading a document image as multi-valued image data; binarizing means for binarizing the multi-valued image data read by the reading means with a predetermined threshold value to obtain binary image data; storage means for storing the binary image data obtained by the binarizing means; recognition accuracy calculation means for calculating character recognition accuracy for the binary image data stored in the storage means; and control means for comparing the recognition accuracy obtained by the recognition accuracy calculation means with a predetermined value and, when the recognition accuracy is lower than the predetermined value, causing the reading means to read the document image again and causing the binarizing means to perform binarization with the predetermined threshold value changed.

3. A character recognition device comprising: first storage means for storing a read document image as multi-valued image data; threshold value determination means for determining a threshold value for binarization; binarizing means for binarizing the multi-valued image data stored in the first storage means with the threshold value determined by the threshold value determination means to obtain binary image data; second storage means for storing the binary image data obtained by the binarizing means; recognition accuracy calculation means for calculating character recognition accuracy for the binary image data stored in the second storage means; and control means for comparing the recognition accuracy obtained by the recognition accuracy calculation means with a predetermined value and, when the recognition accuracy is lower than the predetermined value, causing the threshold value determination means to change the threshold value and causing the binarizing means to perform binarization with the changed threshold value.