JPH05298489A

JPH05298489A - System for recognizing character

Info

Publication number: JPH05298489A
Application number: JP4099716A
Authority: JP
Inventors: Toshio Tsutsumida; 敏夫堤田; Kyoichi Sumiya; 恭一角谷; Yumi Nakayama; 由美中山
Original assignee: N T T DATA TSUSHIN KK; NTT Data Communications Systems Corp
Current assignee: N T T DATA TSUSHIN KK; NTT Data Corp
Priority date: 1992-04-20
Filing date: 1992-04-20
Publication date: 1993-11-12

Abstract

PURPOSE: To improve recognition precision by correcting the collation value for each character category before outputting the character category candidates of a character pattern read from a recognition object range. CONSTITUTION: When a slip contains plural recognition object ranges, character category frequency memory parts 7-1 and 7-2 are provided, one for each recognition object range. A memory selecting part 8 selects the one character category frequency memory part that corresponds to the present recognition object range. A collation value compensating part 9 corrects the collation value for each character category, obtained by an identification collating part 3, based on the frequency information for each character category. The corrected collation value is input to a candidate character category sorting part 6. When the collation values for the character categories are input, the collation value compensating part 9 controls the memory selecting part so that it selects the character category frequency memory part 7-1 corresponding to the present recognition object range, and reads out the frequency information for each character category.

Description

Detailed Description of the Invention

[0001]

[Field of industrial application] The present invention relates to a character recognition method suitable for application to OCR and the like, and in particular to a character recognition method that narrows the target character categories down to a small number during the recognition process so as to lower the misrecognition probability.

[0002]

[Prior art] FIG. 4 is a block diagram showing the configuration of the main part of a character recognition apparatus using a conventional character recognition method. In the figure, reference numeral 1 denotes a feature extraction unit that analyzes the features of a character pattern, obtained by binary-quantizing, character by character, a character string written in a recognition target range on a form by means such as a scanner or character segmentation, and creates a feature vector for identification. Reference numeral 2 denotes a dictionary memory unit that stores, for each character category, a standard feature vector representing the standard features of the many character patterns to be recognized, as collation information. Reference numeral 3 denotes an identification and collation unit that sequentially reads the collation information stored in the dictionary memory unit 2 in character category units, collates it with the feature vector, and outputs, for each character category, a collation value representing the similarity of the feature vector to the collation information. Reference numerals 4-1 and 4-2 denote character category limitation memory units that store, for each character category, whether the collation information used by the identification and collation unit 3 may be used in collation (usable = 1, not usable = 0). When a form contains multiple recognition target ranges, a character category limitation memory unit 4-1, 4-2 is provided for each recognition target range. However, recognition target ranges with common character categories share one memory unit. The figure shows only the units corresponding to two recognition target ranges and omits the character category limitation memory units for the other recognition target ranges. FIG. 5 shows an example of the stored contents when there are 3000 character categories.

[0003] Reference numeral 5 denotes a memory selection unit that selects, from the character category limitation memory units 4-1 and 4-2, the memory unit corresponding to the current recognition target range, inputs the usability information read sequentially in character category units from the selected memory unit to the identification and collation unit 3, and causes the identification and collation unit 3 to perform collation only for character categories whose usability information is "1". Reference numeral 6 denotes a candidate character category sorting unit that sorts the collation values of all character categories obtained by the collation in the identification and collation unit 3 and outputs the character category candidates of the character pattern read from the current recognition target range. The character category candidates output from the candidate character category sorting unit 6 are input to a word collation unit, a recognition result display unit, and the like (not shown), where they are used to select the candidate character category string that matches a word character string that can exist, to allow correction by the operator of the character reading device, and so on.

[0004] In this configuration, when a character string written in a recognition target range on a form is to be recognized, the character string is binary-quantized character by character by means such as a scanner or character segmentation and input to the feature extraction unit 1 as character patterns.

[0005] The feature extraction unit 1 then analyzes the features of the input character pattern and creates a feature vector for identification. This feature vector is input to the identification and collation unit 3.

[0006] When the feature vector of a character pattern is input, the identification and collation unit 3 sequentially reads the collation information from the dictionary memory unit 2 in character category units. It also controls the memory selection unit 5, causing it to select the character category limitation memory unit (for example, 4-1) that corresponds to the current recognition target range, and reads out in synchronization, from the usability information stored in the character category limitation memory unit 4-1, the usability information for the character category of the collation information read from the dictionary memory unit 2. Then, only for character categories whose usability information is "1", it collates the feature vector input from the feature extraction unit 1 with the collation information read from the dictionary memory unit 2 and outputs a collation value representing the similarity of the feature vector to that collation information.
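The flag-gated collation described above can be sketched as follows. This is a minimal illustration, not the apparatus itself: the function names, the toy two-dimensional feature vectors, and the use of squared Euclidean distance as the collation value are all assumptions made for the example.

```python
from typing import Dict, List


def collate(feature: List[float], reference: List[float]) -> float:
    """Stand-in collation value: squared Euclidean distance (an assumption)."""
    return sum((f - r) ** 2 for f, r in zip(feature, reference))


def restricted_collation(
    feature: List[float],
    dictionary: Dict[str, List[float]],
    usable: Dict[str, int],
) -> Dict[str, float]:
    """Collate only against categories whose usability flag is 1."""
    values = {}
    for category, reference in dictionary.items():
        # Categories flagged 0 (or missing) are skipped entirely.
        if usable.get(category, 0) == 1:
            values[category] = collate(feature, reference)
    return values


# Hypothetical dictionary memory contents and a hiragana-only flag table.
dictionary = {"あ": [0.0, 1.0], "い": [1.0, 0.0], "A": [0.1, 0.9]}
hiragana_only = {"あ": 1, "い": 1, "A": 0}

values = restricted_collation([0.0, 0.8], dictionary, hiragana_only)
print(sorted(values))  # only the categories that were allowed
```

Note that the flag table is all-or-nothing per category, which is the limitation the invention later addresses.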

[0007] For example, when the usability information is "1" for hiragana only, collation is performed only against the hiragana collation information, and a collation value is output for each hiragana character.

[0008] These collation values are input to the candidate character category sorting unit 6. The candidate character category sorting unit 6 sorts the collation values of all character categories obtained by the collation in the identification and collation unit 3 and outputs the character category candidates of the character pattern read from the current recognition target range.
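A sorting step of this kind might look like the sketch below. It assumes that a smaller collation value means a closer match (the convention the embodiment states later in this description); the function name, the `top_n` parameter, and the sample values are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def sort_candidates(
    values: Dict[str, float], top_n: Optional[int] = None
) -> List[Tuple[str, float]]:
    """Order character categories from best to worst candidate.

    Assumes a smaller collation value means a closer match.
    """
    ranked = sorted(values.items(), key=lambda item: item[1])
    return ranked if top_n is None else ranked[:top_n]


# Hypothetical collation values for three hiragana categories.
candidates = sort_candidates({"あ": 12.0, "お": 7.5, "め": 9.0})
print(candidates)
```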

[0009] Because of this configuration, when the character categories written in a recognition target range are limited to, for example, hiragana only, setting the usability information to "1" for hiragana alone makes it possible to exclude all non-hiragana character categories from collation when recognizing the character string in that range, which reduces the misreading probability.

[0010]

[Problems to be solved by the invention] However, although the above prior art can specify whether each character category may be used, when there are many character categories to be recognized, as when recognizing kanji, the character category range is narrowed only slightly, so the misreading probability is not greatly reduced.

[0011] The present invention has been made to solve this problem. Its object is to provide a character recognition method that narrows the character category range substantially even when there are many character categories to be recognized, and thereby improves the effective recognition accuracy.

[0012]

[Means for solving the problems] To achieve the above object, the present invention provides a character recognition method comprising: feature extraction means that analyzes the features of a character pattern read from a recognition target range and creates a feature vector for identification; dictionary memory means that stores, for each character category, collation information representing the standard features of the many character patterns to be recognized; identification and collation means that sequentially reads the collation information stored in the dictionary memory means in character category units, collates it with the feature vector, and outputs, for each character category, a collation value representing the similarity of the feature vector to the collation information; and candidate character category sorting means that sorts the collation values obtained for all character categories and outputs the character category candidates of the character pattern read from the recognition target range. The method further provides character category frequency memory means that stores frequency information on the appearance probability of each character category appearing in the recognition target range, and collation value correction means that corrects the collation value for each character category obtained by the identification and collation means on the basis of the frequency information for each character category. The collation value corrected by the collation value correction means is input to the candidate character category sorting means, which outputs the character category candidates of the character pattern read from the recognition target range.

[0013]

[Operation] With the above means, the collation value for each character category is corrected by the frequency information on the appearance probability of each character category appearing in the recognition target range, the corrected collation value is input to the candidate character category sorting means, and the character category candidates of the character pattern read from the recognition target range are output. Character category candidates with a high appearance probability are therefore given priority, and the candidates can be narrowed down more strongly. As a result, the effective recognition accuracy can be improved.

[0014]

[Embodiment] The present invention is described in detail below with reference to the illustrated embodiment.

[0015] FIG. 1 is a block diagram showing one embodiment of the configuration of the main part of a character recognition apparatus using the character recognition method of the present invention. Parts identical to those in FIG. 4 carry the same reference numerals, and their description is omitted. In FIG. 1, reference numerals 7-1 and 7-2 denote character category frequency memory units that store frequency information on the appearance probability of each character category appearing in a recognition target range. When a form contains multiple recognition target ranges, a character category frequency memory unit 7-1, 7-2 is provided for each recognition target range. However, recognition target ranges with common character categories share one memory unit. The figure shows only the units corresponding to two recognition target ranges and omits the character category frequency memory units for the other recognition target ranges.

[0016] Reference numeral 8 denotes a memory selection unit that selects the one character category frequency memory unit corresponding to the current recognition target range. Reference numeral 9 denotes a collation value correction unit that corrects the collation value for each character category obtained by the identification and collation unit 3 on the basis of the frequency information for each character category. The collation value corrected by the collation value correction unit 9 is input to the candidate character category sorting unit 6.

[0017] In this configuration, assume that the form 10 on which the character strings to be recognized are written is laid out as shown in FIG. 2, and that the character strings written in the format code field 11, the place name field 12, the address number field 13, and the name field 14 of the form 10 are to be read and recognized.

[0018] Consider recognizing the "2" in the character string "66番地2" in the address number field 13. The identification and collation unit 3 obtains collation values such as those shown in FIG. 3(a). That is, "Z" and "乙" are characters similar to "2", and the collation values representing the similarity of the numeral "2" read from the address number field 13 to the standard feature vectors of the numeral "2", the letter "Z", and the kanji "乙" are obtained as numerical values such as "2" = 20, "Z" = 15, "乙" = 10. Here, a smaller value indicates that the pattern is closer to the standard feature vector.

[0019] The collation values expressed numerically in this way are input to the collation value correction unit 9. When the collation values for the character categories are input, the collation value correction unit 9 controls the memory selection unit 5, causing it to select the character category frequency memory unit (for example, 7-1) that corresponds to the address number field 13, which is the current recognition target range, and reads out the frequency information for each character category stored in the character category frequency memory unit 7-1.
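The selection of a frequency memory by recognition target range can be pictured as a two-level lookup. The sketch below is illustrative only: the field keys, the assignment of memory 7-2 to the place name field, and the values stored under 7-2 are hypothetical. The values under 7-1 are the ones this description gives for the FIG. 3(b) example.

```python
from typing import Dict

# Frequency memories keyed by their reference numerals.
FREQUENCY_MEMORIES: Dict[str, Dict[str, float]] = {
    # Values from the FIG. 3(b) example in this description.
    "7-1": {"2": 1.0, "Z": 0.2, "乙": 0.1},
    # Hypothetical contents for a second recognition target range.
    "7-2": {"2": 0.1, "Z": 0.1, "乙": 0.3},
}

# Which memory serves which field. Only the address number field's use of
# 7-1 comes from the description; the second mapping is an assumption.
FIELD_TO_MEMORY: Dict[str, str] = {
    "address_number": "7-1",
    "place_name": "7-2",
}


def select_frequency_memory(field: str) -> Dict[str, float]:
    """Return the frequency table for the current recognition target range."""
    return FREQUENCY_MEMORIES[FIELD_TO_MEMORY[field]]


table = select_frequency_memory("address_number")
print(table)
```

Fields that share the same character categories could simply map to the same key, which mirrors the shared-memory arrangement mentioned for the embodiment.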

[0020] For example, as shown in FIG. 3(b), frequency information expressed numerically as "2" = 1.0, "Z" = 0.2, "乙" = 0.1 is read out for the character categories of the numeral "2", the letter "Z", and the kanji "乙". Here, a larger value indicates a higher appearance probability.

[0021] Once this frequency information has been read out, the collation value A is corrected by the frequency information B using a correction formula such as "collation value A ÷ frequency information B = corrected collation value C".
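As a worked illustration, and assuming the division formula is applied unchanged to the example values quoted in this description (collation values 20, 15, 10 and frequency information 1.0, 0.2, 0.1 for "2", "Z", "乙"), the corrected collation values work out as follows. The subscripted symbols are notation introduced here for the example.

```latex
C_{2} = \frac{A_{2}}{B_{2}} = \frac{20}{1.0} = 20, \qquad
C_{\mathrm{Z}} = \frac{A_{\mathrm{Z}}}{B_{\mathrm{Z}}} = \frac{15}{0.2} = 75, \qquad
C_{\text{乙}} = \frac{A_{\text{乙}}}{B_{\text{乙}}} = \frac{10}{0.1} = 100
```

Since a smaller value indicates a closer match, "2" has the best corrected value even though its uncorrected collation value was the worst of the three.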

[0022] As a result, corrected collation values C such as those shown in FIG. 3(c) are obtained. That is, the collation values output from the identification and collation unit 3 indicate that the numeral "2" read from the address number field 13 resembles the kanji "乙". However, the kanji "乙" should rarely appear in the address number field 13, so correction by the frequency information B, which represents this appearance probability, lowers its candidate rank, and the candidate rank of the numeral "2" is raised in its place.

[0023] Accordingly, in this example the character category candidates are output in the order numeral "2", letter "Z", kanji "乙".
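The correction and re-ranking in this example can be reproduced with a short sketch. It uses the collation values and frequency information quoted in this description, the division formula given as an example, and the stated convention that a smaller value is a closer match. The function names are hypothetical.

```python
from typing import Dict, List


def correct_collation_values(
    collation: Dict[str, float], frequency: Dict[str, float]
) -> Dict[str, float]:
    """Apply C = A / B for each character category."""
    # A category missing from the frequency table raises KeyError here;
    # a real system would need a policy for that case.
    return {cat: a / frequency[cat] for cat, a in collation.items()}


def rank_candidates(values: Dict[str, float]) -> List[str]:
    """Smaller value means a closer match, so sort in ascending order."""
    return [cat for cat, _ in sorted(values.items(), key=lambda kv: kv[1])]


collation = {"2": 20.0, "Z": 15.0, "乙": 10.0}   # FIG. 3(a) example values
frequency = {"2": 1.0, "Z": 0.2, "乙": 0.1}      # FIG. 3(b) example values

before = rank_candidates(collation)
after = rank_candidates(correct_collation_values(collation, frequency))
print("before correction:", before)
print("after correction: ", after)
```

Before correction the kanji "乙" ranks first; after correction the order becomes "2", "Z", "乙", matching the order stated in the text.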

[0024] Accordingly, for a form 10 such as that in FIG. 2, the candidate character category range can be narrowed for the place name field 12 by making the frequency information for numerals small and the frequency information for kanji large. Similarly, for the name field 14, the candidate character category range can be narrowed by making the frequency information small for kanji such as "県" (prefecture), "市" (city), and "町" (town), and instead making it large for kanji often used in personal names, such as "夫" and "子".
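The description does not say how the per-field frequency information is obtained. One possible approach, sketched here purely as an assumption, is to count characters in sample text for a field and scale the counts so that the most frequent category gets 1.0. The `floor` parameter is also an assumption: it keeps unseen categories above zero so that a division-based correction never divides by zero.

```python
from collections import Counter
from typing import Dict, Iterable, List


def build_frequency_table(
    samples: Iterable[str], categories: List[str], floor: float = 0.01
) -> Dict[str, float]:
    """Derive relative frequency information from sample field contents.

    The most frequent category is scaled to 1.0; categories never seen in
    the samples receive `floor` instead of 0.
    """
    counts = Counter(ch for text in samples for ch in text)
    peak = max((counts[c] for c in categories), default=0)
    if peak == 0:
        # No evidence at all: treat every category as equally likely.
        return {c: 1.0 for c in categories}
    return {c: max(counts[c] / peak, floor) for c in categories}


# Hypothetical sample contents of a name field.
name_samples = ["花子", "和夫", "京子"]
name_table = build_frequency_table(name_samples, ["子", "夫", "県"])
print(name_table)
```

With these samples, "子" and "夫" receive high frequency values and "県" receives only the floor value, which is the kind of weighting the text describes for the name field.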

[0025]

[Effects of the invention] As described above, according to the present invention, the collation value for each character category is corrected by the frequency information on the appearance probability of each character category appearing in the recognition target range, and the corrected collation value is input to the candidate character category sorting means. Therefore, even when a character category has a lower candidate rank than other character categories because of similar glyph shapes, that rank can be corrected according to the appearance probability and the category can be output as a high-priority character category candidate. As a result, the character category candidates are narrowed down more strongly, and the effective recognition accuracy can be improved.

[Brief description of drawings]

FIG. 1 is a block diagram showing one embodiment of the configuration of the main part of a character recognition apparatus using the character recognition method of the present invention.

FIG. 2 is a diagram showing an example of a form on which character strings to be recognized are written.

FIG. 3 is an explanatory diagram showing examples of collation values, frequency information, and corrected collation values.

FIG. 4 is a block diagram showing the configuration of the main part of a character recognition apparatus using a conventional character recognition method.

FIG. 5 is an explanatory diagram showing an example of the usability information for each character category in a conventional character recognition apparatus.

[Explanation of symbols]

1 ... feature extraction unit, 2 ... dictionary memory unit, 3 ... identification and collation unit, 6 ... candidate character category sorting unit, 7-1, 7-2 ... character category frequency memory units, 8 ... memory selection unit, 9 ... collation value correction unit.

Claims

[Claims]

1. A character recognition method comprising: feature extraction means that analyzes the features of a character pattern read from a recognition target range and creates a feature vector for identification; dictionary memory means that stores, for each character category, collation information representing the standard features of a large number of character patterns to be recognized; identification and collation means that sequentially reads the collation information stored in the dictionary memory means in character category units, collates it with the feature vector, and outputs, for each character category, a collation value representing the similarity of the feature vector to the collation information; and candidate character category sorting means that sorts the collation values obtained for all character categories and outputs the character category candidates of the character pattern read from the recognition target range, characterized in that character category frequency memory means that stores frequency information on the appearance probability of each character category appearing in the recognition target range, and collation value correction means that corrects the collation value for each character category obtained by the identification and collation means on the basis of the frequency information for each character category, are provided, and the collation value corrected by the collation value correction means is input to the candidate character category sorting means, which outputs the character category candidates of the character pattern read from the recognition target range.

2. The character recognition method according to claim 1, wherein the character category frequency memory means is provided for each of a plurality of recognition target ranges, the frequency information stored in the one character category frequency memory means corresponding to the current recognition target range is read out, and the collation value for each character category obtained by the identification and collation means is corrected.

3. The character recognition method according to claim 1, wherein the character category frequency memory means is provided for each recognition target range, among a plurality of recognition target ranges, in which different character categories appear, the frequency information stored in the one character category frequency memory means corresponding to the current recognition target range is read out, and the collation value for each character category obtained by the identification and collation means is corrected.