JP2001084333A

JP2001084333A - Character reader

Info

Publication number: JP2001084333A
Application number: JP25942499A
Authority: JP
Inventors: Michio Nagashima; 道夫永島; Yoshiyasu Masuda; 佳泰増田
Original assignee: VASARA RES Inc; VASARA RESEARCH Inc
Current assignee: VASARA RES Inc; VASARA RESEARCH Inc
Priority date: 1999-09-13
Filing date: 1999-09-13
Publication date: 2001-03-30

Abstract

PROBLEM TO BE SOLVED: To provide a character reader with a high recognition rate. SOLUTION: Character information written or printed on a paper medium 9 is optically read by an image scanner unit 8, and the resulting image data Din are stored in a data buffer 2 as two-dimensional image data Dxy. A character area detection unit 3 divides the data into image data Dc for individual characters and outputs them. Data conversion units a1 to an of n parallel systems apply an affine transformation to the image data Dc and generate rotated image data P1 to Pn, each rotated by its own prescribed rotation angle. Feature extraction units b1 to bn of the n systems extract feature extraction data α1 to αn from the rotated image data P1 to Pn, and a determination unit 4 applies maximum likelihood estimation to the feature extraction data α1 to αn to obtain the parameter (estimated value) ω closest to the character represented by the image data Dc. A collation unit 5 then reads the character code data corresponding to the estimated value ω from a character code storage unit 7 and outputs it.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a character reading device that automatically recognizes characters written or printed on a medium such as paper.

[0002]

[Prior Art] A conventional character reading device optically reads characters written or printed on a paper medium and extracts the features of the characters by applying signal processing to the resulting two-dimensional image data. It then compares the feature data obtained by this feature extraction with reference feature data prepared in advance for each of many kinds of character code data, and outputs the most similar character code data as the recognition result.

[0003]

[Problems to Be Solved by the Invention] With the recent development of electronic equipment built around microcomputers and the like, managing and using character information as digital data has become extremely important. A character reading device whose field of application is limited to specific kinds of documents therefore cannot adequately serve such equipment, and reading character information written or printed on general documents such as newspapers and magazines, or on paper media of arbitrary form, at a higher recognition rate has become a major challenge.

[0004] Furthermore, when a conventional device attempts precise arithmetic processing to raise the recognition rate while extracting character features by signal processing of the optically read two-dimensional image data, the processing takes a long time, so the device can no longer offer convenience to the user.

[0005] In particular, when character information on a paper medium is read tilted (rotated) relative to the optical means, conventional devices raise the recognition rate by applying tilt-correction arithmetic to the two-dimensional image data, but this arithmetic takes an extremely long time.

[0006] The present invention has been made to overcome these problems, and its object is to provide a character reading device that raises the character recognition rate and shortens the processing time.

[0007]

[Means for Solving the Problems] To achieve the above object, the present invention is a character reading device that automatically recognizes character information written or printed on a medium, comprising: detecting means for detecting the character information as image data; data conversion means for generating rotated image data by applying rotation conversion processing at a predetermined rotation angle to the image data detected by the detecting means; feature extraction means for generating feature data by extracting features of the rotated image data generated by the data conversion means; and determining means for specifying the character information, by statistical signal processing, based on the feature data generated by the feature extraction means; wherein the data conversion means and the feature extraction means are configured as a plurality of processing systems consisting of at least two such combinations, and the determining means specifies the character information based on the feature data generated by the plurality of feature extraction means in the plurality of processing systems.

[0008] With this configuration, the detecting means detects the character information written or printed on the medium and outputs it as image data. This image data is supplied in parallel to the data conversion means of each of the processing systems. Each data conversion means performs conversion processing that rotates the image data by its own specific rotation angle and supplies the resulting rotated image data to the corresponding feature extraction means. Each feature extraction means extracts the features of its rotated image data and outputs the feature data to the determining means in parallel. The determining means specifies the character information by applying statistical signal processing to the feature data supplied in parallel.

[0009] Providing the data conversion means and feature extraction means as multiple systems that operate in parallel thus enables high-speed character recognition.

[0010] In addition, feature data are generated from a plurality of rotated image data, each rotated by a specific rotation angle, and the character information is specified by applying statistical signal processing to these feature data, which differ in their rotation-angle parameter. Therefore, even when the character information is detected tilted (rotated) relative to the detecting means, the error caused by the tilt is corrected automatically and the accuracy with which the character information is specified (the recognition accuracy) improves.

[0011] The character reading device is further characterized by comprising storage means that stores character code data in advance, and collating means that, based on the result specified by the determining means, collates the character code data stored in the storage means and outputs the character code data corresponding to the character information.

[0012] With this configuration, the collating means outputs the character code data corresponding to the character information specified by the determining means. Supplying this character code data to a computer, communication equipment, or the like makes the device compatible with digital electronic equipment.

[0013]

[Embodiments of the Invention] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram showing the main configuration of the character reading device according to this embodiment, FIG. 2 is an explanatory diagram schematically showing the function of the data conversion units a1 to an, and FIG. 3 is a flowchart explaining the operation of the character reading device.

[0014] In FIG. 1, the character reading device comprises a system controller 1, a data buffer 2, a character area detection unit 3, a determination unit 4, a collation unit 5, a reference data storage unit 6, a character code storage unit 7, an image scanner unit 8, n data conversion units a1 to an (n being any number of 2 or more), and n feature extraction units b1 to bn. The character area detection unit 3, determination unit 4, collation unit 5, data conversion units a1 to an, and feature extraction units b1 to bn are implemented by a digital signal processor (DSP) operating under the control of the system controller 1.

[0015] The image scanner unit 8 has a line sensor (not shown) in which many fine light-receiving elements are arrayed along the sub-scanning direction x, and optically reads the paper medium 9 fed along the direction y orthogonal to the sub-scanning direction x (hereinafter, the main scanning direction).

[0016] For example, when the user places the image scanner unit 8 facing the surface of the paper medium 9 and moves the image scanner unit 8 and the paper medium 9 in mutually opposite directions along the main scanning direction y, the line sensor of the image scanner unit 8 repeats main-scanning reading and sub-scanning reading over the paper medium 9 and thereby optically reads the character information on the paper medium 9 two-dimensionally. Although not shown, the same two-dimensional optical reading is also performed when the image scanner unit 8 is fixed and an automatic paper feed mechanism conveys the paper medium 9 to the image scanner unit 8.

[0017] The system controller 1 contains a microprocessor (MPU) that, by executing a preset computer program, controls the reading operation of the image scanner unit 8 and the operation of the character reading device as a whole.

[0018] The data buffer 2 is formed of erasable, rewritable semiconductor memory such as DRAM (dynamic RAM). It receives the pixel data Din output line-sequentially from the image scanner unit 8 during optical reading and stores them as frame-unit two-dimensional dot matrix data (hereinafter, image data) Dxy.

[0019] The character area detection unit 3 distinguishes, within the image data Dxy stored in the data buffer 2, pixel data belonging to individual characters from pixel data belonging to the background, and identifies the area (extent) of each character. Specifically, it compares the gradation level of each pixel of the image data Dxy with a predetermined threshold, treats pixels whose gradation level is below the threshold as background and pixels whose gradation level is above the threshold as character pixels, judges each relatively compact cluster of character pixels to be the area of one character (the font size), and partitions the image data Dxy by area. It then outputs each partitioned piece of image data Dc as the information for one character. The image data Dc are partitioned into rectangles based on the orthogonal coordinate system defined by the sub-scanning direction x and the main scanning direction y.
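The thresholding and rectangular partitioning described in [0019] can be sketched as follows. This is only an illustration under stated assumptions: the function names `find_character_boxes` and `crop` are hypothetical, and a "relatively compact cluster" is approximated here by an 8-connected component, which would split characters made of disconnected strokes. The patent does not specify the clustering rule.

```python
from collections import deque

def find_character_boxes(image, threshold):
    """Return bounding boxes (x0, y0, x1, y1) of connected character regions.

    image is a list of rows of gradation levels. Pixels above `threshold`
    are treated as character pixels, all others as background.
    """
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if image[y][x] <= threshold or seen[y][x]:
                continue
            # Flood-fill one 8-connected cluster of character pixels.
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dx in (-1, 0, 1):
                    for dy in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and not seen[ny][nx]
                                and image[ny][nx] > threshold):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes

def crop(image, box):
    """Cut the rectangular image data Dc for one box out of Dxy."""
    x0, y0, x1, y1 = box
    return [row[x0:x1 + 1] for row in image[y0:y1 + 1]]
```

Each box returned by `find_character_boxes` plays the role of one rectangular character area, and `crop` yields the corresponding image data Dc.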

[0020] In accordance with instructions from the system controller 1, the data conversion units a1 to an receive the image data Dc supplied from the character area detection unit 3, perform arithmetic processing that rotates the image data Dc by predetermined rotation angles θ1 to θn respectively, and output the resulting two-dimensional image data (hereinafter, rotated image data) P1 to Pn.

[0021] Described in more detail with reference to FIG. 2, the data conversion unit a1 rotates the image data Dc shown in FIG. 2(a) by the rotation angle θ1 to generate and output the rotated image data P1 shown in FIG. 2(b). The data conversion unit a2 rotates the same image data Dc by the rotation angle θ2 to generate and output the rotated image data P2 shown in FIG. 2(c).

[0022] Likewise, the data conversion units a3 to an each rotate the image data Dc by the rotation angles θ3 to θn to generate and output the rotated image data P3 to Pn shown in FIG. 2(d) to (f).

[0023] The rotation angles θ1 to θn are set to mutually different values, and their specific values are stored in advance in a lookup table (not shown) in the system controller 1.

[0024] Furthermore, in this embodiment the rotation angle θ1 of the data conversion unit a1 is normally set to θ1 = 0°, and the rotation angles θ2 to θn of the other data conversion units a2 to an are set to positive and negative angles, in roughly equal numbers, relative to θ1.
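One way to fill such a lookup table is sketched below. The uniform step size and the function name `rotation_angles` are assumptions made for illustration. The patent only requires that the angles differ from one another and be split roughly evenly between positive and negative values around θ1 = 0°.

```python
def rotation_angles(n, step_deg):
    """Return n distinct angles: 0 first, then +step, -step, +2*step, -2*step, ..."""
    angles = [0.0]
    k = 1
    while len(angles) < n:
        angles.append(k * step_deg)       # positive side
        if len(angles) < n:
            angles.append(-k * step_deg)  # matching negative side
        k += 1
    return angles
```

For example, `rotation_angles(5, 2.0)` gives `[0.0, 2.0, -2.0, 4.0, -4.0]`. When n is even, one side receives one more angle than the other, which matches the "roughly equal numbers" wording.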

[0025] Also in this embodiment, an affine transformation is used as the algorithm for the rotation processing. The present invention is not limited to affine transformation, however, and other methods may be used.
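A rotation by affine transformation can be sketched as below. The patent does not give the details, so the following are assumptions: rotation about the image centre, nearest-neighbour sampling, an output of the same size as the input, and the sign convention of the angle. The name `rotate_image` is hypothetical.

```python
import math

def rotate_image(image, angle_deg, background=0):
    """Rotate a 2-D image about its centre using an inverse affine mapping."""
    height, width = len(image), len(image[0])
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[background] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Map each output pixel back into the source image so that
            # every output pixel receives a value (no holes).
            dx, dy = x - cx, y - cy
            sx = cos_t * dx + sin_t * dy + cx
            sy = -sin_t * dx + cos_t * dy + cy
            ix, iy = int(round(sx)), int(round(sy))
            if 0 <= ix < width and 0 <= iy < height:
                out[y][x] = image[iy][ix]
    return out
```

Pixels that map outside the source are filled with `background`, so corners of the character area can be lost at larger angles. A practical implementation would probably pad the image or interpolate.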

[0026] The feature extraction units b1 to bn apply feature extraction processing to the rotated image data P1 to Pn generated by the data conversion units a1 to an, thereby extracting the features of the character information contained in each of the rotated image data P1 to Pn, and output the resulting feature extraction data α1 to αn to the determination unit 4.

[0027] That is, each of the feature extraction units b1 to bn is connected in cascade to the corresponding data conversion unit a1 to an, so that the data conversion units and feature extraction units form n processing systems. Because the data conversion units a1 to an and the feature extraction units b1 to bn operate in parallel in synchronization with a predetermined timing, the processing period from input of the image data Dc to output of the feature extraction data α1 to αn is approximately equal across the n parallel systems.
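The structure of the n processing systems in [0027] can be mimicked in software as one rotate-then-extract pipeline per angle. The sketch below is an assumption-laden analogy, not the DSP implementation of the embodiment: `run_pipelines` is a hypothetical name, and `rotate` and `extract` stand in for a data conversion unit and a feature extraction unit supplied by the caller.

```python
from concurrent.futures import ThreadPoolExecutor

def run_pipelines(image, angles, rotate, extract):
    """Run one rotate -> extract pipeline per angle and return results in angle order."""
    def pipeline(angle):
        # One processing system: data conversion followed by feature extraction.
        return extract(rotate(image, angle))

    with ThreadPoolExecutor(max_workers=len(angles)) as pool:
        # map() preserves the order of `angles`, so result i belongs to angle i.
        return list(pool.map(pipeline, angles))
```

In standard CPython, threads do not run pure-Python computation simultaneously, so this shows the shape of the data flow rather than the speed-up that dedicated parallel hardware would give.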

[0028] In this embodiment, each of the feature extraction units b1 to bn generates the feature extraction data α1 to αn using a feature extraction algorithm such as the pattern matching method or the structural analysis method.

[0029] In the pattern matching method, standard template data for many character images, stored in advance in a storage unit (not shown) provided in the feature extraction units b1 to bn or in part of the reference data storage unit 6, are compared with each of the rotated image data P1 to Pn, and the portions where the standard template data and the rotated image data coincide are extracted as constituent parts of the character, thereby generating the feature extraction data α1 to αn.
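The comparison against a standard template in [0029] might look like the following for binary images of equal size. The overlap mask corresponds to the "coinciding portions", and the score is an additional assumption introduced here to summarize the match as a number. `match_template` is a hypothetical name, and the patent does not define the form of the feature extraction data.

```python
def match_template(image, template):
    """Return (overlap, score) for two equally sized binary (0/1) images.

    overlap keeps the pixels set in both images. score is the share of
    template pixels that are also set in the image.
    """
    overlap = [[a & b for a, b in zip(row_i, row_t)]
               for row_i, row_t in zip(image, template)]
    template_pixels = sum(map(sum, template))
    matched = sum(map(sum, overlap))
    score = matched / template_pixels if template_pixels else 0.0
    return overlap, score
```

Repeating this for every stored template would give one score per candidate character, which could then serve as a feature vector for one rotated image.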

[0030] In the structural analysis method, data corresponding to the line elements of the character are extracted from each of the rotated image data P1 to Pn and analyzed, and the feature extraction data α1 to αn are generated using the character's end points, concavities, convexities, loops, and the variation in their size and shape as feature parameters.
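Two of the structural features named in [0030], end points and loops, can be computed on a binary image roughly as follows. This is a simplified sketch: it assumes strokes already thinned to about one pixel in width, defines an end point as a stroke pixel with exactly one 8-connected stroke neighbour, and defines a loop as a background region that does not reach the image border. The function names are hypothetical.

```python
def count_end_points(image):
    """Count stroke pixels with exactly one 8-connected stroke neighbour."""
    height, width = len(image), len(image[0])
    count = 0
    for y in range(height):
        for x in range(width):
            if not image[y][x]:
                continue
            neighbours = sum(
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (nx, ny) != (x, y)
            )
            if neighbours == 1:
                count += 1
    return count

def count_loops(image):
    """Count enclosed background regions (holes), using 4-connectivity."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    loops = 0
    for y in range(height):
        for x in range(width):
            if image[y][x] or seen[y][x]:
                continue
            # Flood-fill one background region and note whether it reaches the border.
            stack, touches_border = [(x, y)], False
            seen[y][x] = True
            while stack:
                cx, cy = stack.pop()
                if cx in (0, width - 1) or cy in (0, height - 1):
                    touches_border = True
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not image[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            if not touches_border:
                loops += 1
    return loops
```

A ring-shaped glyph such as "O" gives zero end points and one loop, while a single vertical stroke gives two end points and no loop.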

[0031] Using maximum likelihood estimation, the determination unit 4 obtains the likelihood function between the feature extraction data α1 to αn supplied in parallel from the feature extraction units b1 to bn and the standard feature data stored in advance in the reference data storage unit 6, and outputs the parameter ω that maximizes this likelihood function (joint probability density) as the estimate closest to the character to be recognized (the character in the image data Dc).
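The patent does not state the probability model behind the likelihood function, so the following sketch makes its own assumptions: each feature extraction data αi is a numeric vector, the n vectors are independent given the character, and each deviates from the stored standard feature vector by isotropic Gaussian noise of standard deviation `sigma`. Under those assumptions the joint log-likelihood is a sum of squared errors, and ω is the candidate that maximizes it. All names are hypothetical.

```python
def log_likelihood(features, reference, sigma=1.0):
    """Joint Gaussian log-likelihood of all feature vectors given one reference.

    The additive normalisation constant is omitted because it is the same
    for every candidate and does not affect the arg max.
    """
    total = 0.0
    for alpha in features:
        squared_error = sum((a - r) ** 2 for a, r in zip(alpha, reference))
        total += -squared_error / (2.0 * sigma ** 2)
    return total

def estimate_character(features, references, sigma=1.0):
    """Return the key omega whose reference maximises the joint log-likelihood."""
    return max(
        references,
        key=lambda omega: log_likelihood(features, references[omega], sigma),
    )
```

Here `references` maps each candidate ω to its standard feature vector, in the role of the reference data storage unit 6, and `features` holds α1 to αn.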

[0032] The reference data storage unit 6 is formed of read-only memory (ROM) and stores in advance standard feature data corresponding to standard characters (characters conforming to JIS level 1, JIS level 2, and so on). That is, it stores feature data for each of a large number of standard characters and supplies these feature data to the determination unit 4 when the determination unit 4 performs the determination processing described above.

[0033] The character code storage unit 7 is formed of read-only memory (ROM) and stores in advance the character code data corresponding to the standard characters.

[0034] The collation unit 5 refers to the character code storage unit 7 based on the parameter (estimated value) ω supplied from the determination unit 4 and outputs the corresponding character code data as the recognition result.

[0035] Next, an example of the operation of the character reading device configured as above is described with reference to the flowchart of FIG. 3.

[0036] In FIG. 3, when the user switches on the main power supply or the like, the character reading device starts operating and waits until a paper medium 9 is supplied to the image scanner unit 8 (step 100). When the paper medium 9 is supplied, the transport mechanism starts operating and automatically conveys (feeds) the paper medium 9; as it is conveyed, the image scanner unit 8 reads the character information on the paper medium 9 and stores the read image data Din in the data buffer 2 (step 102). Reading is complete when the trailing edge of the paper medium 9 passes the image scanner unit 8 (step 104).

[0037] Next, in step 106, the character area detection unit 3 partitions the image data Dxy stored in the data buffer 2 into individual character areas and outputs the first partitioned image data Dc to the data conversion units a1 to an.

[0038] Next, in step 108, the data conversion units a1 to an set their mutually different rotation angles θ1 to θn in accordance with instructions from the system controller 1, and in step 110 they apply affine transformation processing to the image data Dc at those rotation angles, generating the rotated image data P1 to Pn shown in FIG. 2.

[0039] Next, the feature extraction units b1 to bn extract features from each of the rotated image data P1 to Pn to generate the feature extraction data α1 to αn (step 112), and the determination unit 4 then determines, by maximum likelihood estimation, the estimated value closest to the character from the feature extraction data α1 to αn (step 114).

[0040] Next, in step 116, it is determined whether the character can be specified from the estimated value. If it cannot, the rotation angles θ1 to θn are finely adjusted (step 122) and the processing from step 110 is repeated.

[0041] If, on the other hand, the character can be specified in step 116, processing moves to step 118, where the collation unit 5 reads the character code data corresponding to the estimated value from the character code storage unit 7 and outputs it.

[0042] Next, in step 124, it is determined whether recognition has been completed for the image data of all characters stored in the data buffer 2; if so, character recognition for one sheet of the paper medium 9 ends.

[0043] If, on the other hand, step 124 finds that recognition is not yet complete, processing moves to step 126, where the character area detection unit 3 transfers the image data Dc of the next character to the data conversion units a1 to an, and the processing from step 108 is repeated.
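The control flow of steps 106 to 126 can be summarized in a short sketch. The callables `recognise` (steps 110 to 116, returning a character code or `None` when the character cannot be specified) and `adjust` (step 122) are hypothetical stand-ins for the units described above. The retry limit `max_retries` is an assumption added here so that the loop always terminates, since the flowchart as described sets no such limit.

```python
def recognise_page(char_images, angles, recognise, adjust, max_retries=3):
    """Recognise each character image, retrying with adjusted angles on failure."""
    results = []
    for image in char_images:                 # steps 106 and 126: next character
        current = list(angles)                # step 108: set the initial angles
        code = None
        for _ in range(max_retries + 1):
            code = recognise(image, current)  # steps 110 to 116
            if code is not None:
                break
            current = adjust(current)         # step 122: fine-adjust the angles
        results.append(code)                  # step 118: output (None if unresolved)
    return results                            # step 124: all characters processed
```

Resetting the angles for each new character mirrors the return to step 108 after step 126, whereas a failed determination returns only to step 110 with the adjusted angles.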

[0044] As described above, the character reading device of this embodiment provides the data conversion units a1 to an and the feature extraction units b1 to bn as multiple systems operating in parallel, which enables high-speed character recognition. In addition, feature extraction data α1 to αn are generated from the rotated image data P1 to Pn, each rotated by a specific rotation angle θ1 to θn, and the character information is specified (recognized) by processing these feature extraction data, which differ in their rotation-angle parameter, with maximum likelihood estimation. Therefore, even when the image scanner unit 8, which serves as the detecting means, detects the character information on the paper medium 9 in a tilted (rotated) state, the error caused by the tilt is corrected automatically and the recognition accuracy of the character information can be improved.

[0045] Although this embodiment uses maximum likelihood estimation as the statistical signal processing, the present invention is not limited to this, and statistical signal processing other than maximum likelihood estimation may be applied.

[0046]

[Effects of the Invention] As described above, according to the present invention, the data conversion means and feature extraction means are provided as multiple systems operating in parallel, which enables high-speed character recognition. In addition, feature data are generated from a plurality of rotated image data, each rotated by a specific rotation angle, and the character information is recognized by applying statistical signal processing to these feature data, which differ in their rotation-angle parameter. Therefore, even when the character information on the medium is detected tilted (rotated) relative to the detecting means, the error caused by the tilt is corrected automatically and the recognition accuracy of the character information can be improved.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of the character reading device according to this embodiment.

FIG. 2 is an explanatory diagram schematically showing the function of the data conversion units.

FIG. 3 is a flowchart explaining the operation of the character reading device according to this embodiment.

[Explanation of symbols]

1: system controller
2: data buffer
3: character area detection unit
4: determination unit
5: collation unit
6: reference data storage unit
7: character code storage unit
8: image scanner unit
9: paper medium
a1 to an: data conversion units
b1 to bn: feature extraction units

Claims

[Claims]

1. A character reading device that automatically recognizes character information written or printed on a medium, comprising: detecting means for detecting the character information as image data; data conversion means for generating rotated image data by applying rotation conversion processing at a predetermined rotation angle to the image data detected by the detecting means; feature extraction means for generating feature data by extracting features of the rotated image data generated by the data conversion means; and determining means for specifying the character information, by statistical signal processing, based on the feature data generated by the feature extraction means; wherein the data conversion means and the feature extraction means are configured as a plurality of processing systems consisting of at least two such combinations, and the determining means specifies the character information based on the feature data generated by the plurality of feature extraction means in the plurality of processing systems.

2. The character reading device according to claim 1, further comprising: storage means that stores character code data in advance; and collating means that, based on the result specified by the determining means, collates the character code data stored in the storage means and outputs the character code data corresponding to the character information.