JPH0950489A

JPH0950489A - Character recognizing device

Info

Publication number: JPH0950489A
Application number: JP7241415A
Authority: JP
Inventors: Yoshinori Ookuma; 好憲大熊; Isao Sugano; 功菅野
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1995-05-30
Filing date: 1995-09-20
Publication date: 1997-02-18

Abstract

PROBLEM TO BE SOLVED: To provide a character recognizing device which is capable of preventing the lowering of the recognition rate caused by the difference of the kind of a writing tool entering character in a medium to be read. SOLUTION: In the character recognizing device provided with a writing tool discrimination part 2, a feature extraction part 3 and a discrimination means 4, the feature extraction part 3 has a constitution that feature amt. to be extracted is made to differ according to the discrimination result obtained by the writing tool discrimination part 2. In the character recognizing device provided with the writing tool discrimination part, the feature extraction part, a blurring processing part and the discrimination part, the blurring processing part has a constitution that the degree of the blurring processing to be applied to the feature amt. of a character pattern is made to differ according the discrimination result obtained by the writing tool discrimination part.

Description

Detailed Description of the Invention

【０００１】[0001]

【発明の属する技術分野】この発明は、文字、記号など
（以下、「文字」という。）の書かれた被読取媒体を光
学的に走査して得られるイメージから文字を読み取る文
字認識装置に関するものである。BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to a character recognition device for reading characters from an image obtained by optically scanning a medium to be read in which characters, symbols and the like (hereinafter referred to as "characters") are written. Is.

【０００２】[0002]

【従来の技術】紙や帳票などの媒体に文字を記入する際
に用いる筆記具は、一般に、黒色の鉛筆やボールペンと
される。しかし、同じ黒色で書かれた文字であっても、
鉛筆で書かれた文字とボールペンで書かれた文字とで
は、文字の濃度、文字のエッジの状態、文字の太さなど
が違うため、両文字に対し共通の文字認識処理を行なっ
た場合は認識率に悪影響を与えることが多い。そこでこ
れを回避するため、例えば特開昭６２−１１２１２４号
公報には、文字記入に使用した筆記具と同じ筆記具で被
読取媒体の所定の個所に設けられているマークを塗りつ
ぶし、この塗りつぶしたマークをセンサで読取り、この
センサの出力を筆記具の種類別に予め用意された基準値
と比較して使用した筆記具を判別すると共に、この判別
に応じて選択される光学フィルタを介し文字パターンを
読み取るようにした、光学文字読取装置が開示されてい
る。2. Description of the Related Art A writing instrument used for writing characters on a medium such as paper or a form is generally a black pencil or a ballpoint pen. However, even if the letters are written in the same black,
Characters written with a pencil and those written with a ballpoint pen differ in the density of characters, the state of the edges of characters, the thickness of characters, etc. Therefore, when common character recognition processing is performed for both characters, they are recognized. Often negatively impacts rates. In order to avoid this, for example, in Japanese Unexamined Patent Publication No. 62-112124, a mark provided at a predetermined position on a medium to be read is filled with the same writing tool used for writing characters, and the filled mark is used. Reading with a sensor, comparing the output of this sensor with a reference value prepared in advance for each type of writing instrument to determine the writing instrument used, and to read the character pattern via an optical filter selected according to this determination , An optical character reader is disclosed.

【０００３】[0003]

【発明が解決しようとする課題】しかしながら、この従
来の光学文字読取装置は、選択された光学フィルタを通
して被読取媒体上の画情報を読み取るとあるように、あ
くまで画情報を好適な光学フィルタを用いて取り込む装
置であり、その後の情報処理については何ら言及のない
装置であった。したがって、必ずしも良好な認識精度が
得られる装置とは限らない。However, this conventional optical character reading apparatus uses only a suitable optical filter for image information, as in reading the image information on the medium to be read through the selected optical filter. It was a device that took in the information, and did not mention any subsequent information processing. Therefore, the device is not always one that can obtain good recognition accuracy.

【０００４】[0004]

【課題を解決するための手段】そこで、この出願の第一
発明では、被読取媒体に文字を記入した際に用いた筆記
具の種類を前記被読取媒体のイメージデータに基づいて
判別する筆記具判別部と、前記イメージデータから文字
パターンを得る前処理部と、前記文字パターンの特徴量
を抽出する特徴抽出部と、該抽出された前記文字パター
ンの特徴量を予め用意されている複数の標準文字パター
ンの特徴量と照合して前記文字パターンを識別する識別
部と、を具える文字認識装置において、前記特徴抽出部
を、前記筆記具判別部によって得られる判別結果に応じ
抽出する特徴量を異ならせる構成のものとしてあること
を特徴とする。Therefore, in the first invention of this application, a writing tool discriminating section for discriminating the type of writing instrument used when writing characters on a medium to be read based on image data of the medium to be read. A preprocessing unit for obtaining a character pattern from the image data, a feature extraction unit for extracting a feature amount of the character pattern, and a plurality of standard character patterns prepared in advance for the extracted feature amount of the character pattern. A character recognition device for identifying the character pattern by comparing with the feature amount of the above, the feature extracting unit is configured to vary the feature amount to be extracted according to the determination result obtained by the writing instrument determining unit. It is characterized by being as.

【０００５】なお、この第一発明において抽出する特徴
量を異ならせるとは、具体的には、筆記具の種類の違い
によって発生するかすれやつぶれ等の変形を吸収し得る
ように特徴量を異ならせること、すなわち筆記具の種類
にかかわらず標準文字パターンの特徴量に対応した安定
した特徴量が得られるように特徴量を異ならせることで
ある。In the first aspect of the present invention, differentiating the characteristic amount to be extracted specifically means differentiating the characteristic amount so as to absorb deformation such as blurring or crushing caused by a difference in the type of writing instrument. That is, the feature amounts are made different so that a stable feature amount corresponding to the feature amount of the standard character pattern can be obtained regardless of the type of writing instrument.

【０００６】またこの出願の第二発明では、被読取媒体
に文字を記入した際に用いた筆記具の種類を前記被読取
媒体のイメージデータに基づいて判別する筆記具判別部
と、前記イメージデータから文字パターンを得る前処理
部と、前記文字パターンの特徴量を抽出する特徴抽出部
と、該抽出された前記文字パターンの特徴量にぼけ処理
を施すぼけ処理部と、前記文字パターンの前記ぼけ処理
の済んだ特徴量を予め用意されている複数の標準文字パ
ターンの特徴量と照合して前記文字パターンを識別する
識別部と、を具える文字認識装置において、前記ぼけ処
理部を、前記筆記具判別部で得られた判別結果に応じ前
記文字パターンの特徴量に施すぼけ処理の程度を異なら
せる構成のものとしてあることを特徴とする。In the second invention of this application, a writing tool discriminating section for discriminating the kind of writing instrument used when a character is written on the medium to be read based on the image data of the medium to be read, and a character from the image data. A pre-processing unit for obtaining a pattern, a feature extraction unit for extracting a feature amount of the character pattern, a blur processing unit for performing a blur process on the extracted feature amount of the character pattern, and a blur processing unit for the blur process of the character pattern. In a character recognition device, comprising: an identification unit that identifies the character pattern by collating the completed feature amount with the feature amounts of a plurality of standard character patterns that are prepared in advance, the blur processing unit is the writing instrument determination unit. It is characterized in that the degree of blurring processing to be applied to the feature amount of the character pattern is varied according to the discrimination result obtained in (1).

【０００７】ここで、ぼけ処理の程度を異ならせると
は、具体的には、筆記具の種類の違いによって発生する
かすれやつぶれ等の変形を吸収し得るように特徴量に対
しぼけ処理を実施すること、すなわち筆記具の種類にか
かわらず標準文字パターンの特徴量に対応した安定した
特徴量が得られるようなぼけ処理を実施することであ
る。Here, to make the degree of blurring different is specifically, the blurring is performed on the feature amount so as to absorb the deformation such as blurring or crushing caused by the difference in the type of writing instrument. That is, the blurring process is performed so that a stable feature amount corresponding to the feature amount of the standard character pattern can be obtained regardless of the type of writing instrument.

【０００８】[0008]

【作用】この出願の第一発明の構成によれば、筆記具の
種類の違いに起因するかすれやつぶれ等の除去が可能な
特徴抽出処理ができ、また、第二発明の構成によれば筆
記具の種類の違いに起因するかすれやつぶれ等の除去が
可能なぼけ処理ができる。このため、いずれの発明の場
合も、筆記具の種類の違いに起因するかすれやつぶれ等
が除去された特徴量が得られかつこれを用いての認識処
理が行なえる文字認識装置が得られる。According to the constitution of the first invention of this application, the feature extraction processing capable of removing the blur and the crushing due to the difference in the type of the writing implement can be performed, and according to the constitution of the second invention of the writing implement. It is possible to perform blurring processing that can remove blurring or crushing caused by a difference in type. Therefore, in any of the inventions, it is possible to obtain a character recognizing device that can obtain a feature amount from which blurring or crushing due to a difference in writing instrument type is removed and that can perform recognition processing using the feature amount.

【０００９】[0009]

【発明の実施の形態】以下、図面を参照してこの出願の
各発明の実施の形態についてそれぞれ説明する。ただ
し、説明に用いる各図はこの発明を理解出来る程度に概
略的に示してある。また、説明に用いる各図において同
様な構成成分については同一の番号を付し、その重複す
る説明を省略することもある。Embodiments of the present invention will be described below with reference to the accompanying drawings. However, the drawings used in the description are schematically shown to the extent that the present invention can be understood. In each of the drawings used for description, the same components are denoted by the same reference numerals, and overlapping description may be omitted.

【００１０】１．第一発明の説明図１は第一発明の実施の形態の文字認識装置の構成を示
すブロック図である。この実施の形態の文字認識装置
は、前処理部１と、筆記具判別部２と、特徴抽出部３
と、識別部４とを具える。[0010] 1. Description of First Invention FIG. 1 is a block diagram showing a configuration of a character recognition device according to an embodiment of the first invention. The character recognition device according to this embodiment includes a preprocessing unit 1, a writing instrument discrimination unit 2, and a feature extraction unit 3.
And an identification unit 4.

【００１１】ここで、前処理部１は任意好適な構成のも
のとすることができるが、この実施の形態の場合は図２
に示した構成としてある。すなわち、被読取媒体を走査
して得られる光信号を光電変換するための光電変換部１
１と、該光電変換部１１を介して得られる多値画像を格
納するための第１パターンレジスタ１２と、該第１パタ
ーンレジスタ１２に格納された多値画像の２値化処理を
行なうための２値化部１３とで構成してある。なおこの
実施の形態では、筆記具判別部２がＲ（赤）、Ｇ（緑）
およびＢ（青）の３色の画素データに基づいて（詳細は
後述する）筆記具の判別を行なう構成のものである例を
考えるので、前処理部１における光電変換部１１は、
Ｒ、Ｇ、Ｂ毎の濃度情報を読めるものとし、また、第１
パターンレジスタ１２は１画素につきＲ（赤）、Ｇ
（緑）およびＢ（青）の３色の濃度情報を格納できるも
のとしている。さらにこの第１パターンレジスタ１２
は、その上にＸ−Ｙ座標を仮想的に設定しこの座標系で
表される任意の位置の画素データを読み出すことができ
るものとしてある。Here, the pre-processing unit 1 may have any suitable configuration, but in the case of this embodiment, FIG.
It has the configuration shown in. That is, the photoelectric conversion unit 1 for photoelectrically converting an optical signal obtained by scanning the medium to be read.
1, a first pattern register 12 for storing a multi-valued image obtained through the photoelectric conversion unit 11, and a binary patterning process for the multi-valued image stored in the first pattern register 12. It is composed of a binarization unit 13. In addition, in this embodiment, the writing instrument determination unit 2 uses R (red) and G (green).
Considering an example in which the writing tool is discriminated (details will be described later) based on pixel data of three colors B and (B), the photoelectric conversion unit 11 in the preprocessing unit 1
The density information for each of R, G, and B can be read.
The pattern register 12 has R (red) and G per pixel.
The density information of three colors (green) and B (blue) can be stored. Further, this first pattern register 12
Is capable of virtually setting XY coordinates on it and reading out pixel data at an arbitrary position represented by this coordinate system.

【００１２】なお、この実施の形態におけるカラー画像
イメージでは、Ｒ、Ｇ、Ｂの濃度情報が全て０付近であ
るものを黒とする。In the color image of this embodiment, black is used when the density information of R, G and B is all near 0.

【００１３】また、筆記具判別部２は、この実施の形態
では、イメージデータとしての第１パターンレジスタ１
２に格納されたカラー画像から文字線画素の抽出を行な
い、該抽出された各画素のＲ、Ｇ、Ｂ値の濃度の違いに
基づいて筆記具の判別を行なうものとしてある（その詳
細な説明は後の動作説明において行なう。）。しかし、
筆記具判別部２の構成は、筆記具の判別ができその結果
を特徴抽出部３に出力できる構成であれば特に限定され
ない。例えば特開昭６２−１１２１２４号公報に開示の
様に塗りつぶしマークを用いた構成のものでも良い。Further, the writing instrument discriminating section 2 in the present embodiment has the first pattern register 1 as image data.
The character line pixels are extracted from the color image stored in No. 2 and the writing instrument is discriminated based on the difference in the R, G, and B value densities of the extracted pixels (the detailed description will be given. This will be described later in the description of the operation.). But,
The configuration of the writing instrument discriminating unit 2 is not particularly limited as long as the writing instrument can be discriminated and the result can be output to the feature extracting unit 3. For example, a structure using a filled mark as disclosed in JP-A-62-112124 may be used.

【００１４】また、特徴抽出部３は、この実施の形態で
は、図３に示す様に、第２パターンレジスタ３１と、線
幅計算部３２と、文字枠検出部３３と、４つのサブパタ
ーン抽出部３４、３５、３６、３７と、特徴マトリクス
抽出部３８とで構成してある。これら各構成成分の詳細
な説明は後の動作説明において行なうが、その概要は次
の様である。Further, in this embodiment, the feature extraction unit 3 has a second pattern register 31, a line width calculation unit 32, a character frame detection unit 33, and four sub-pattern extraction units, as shown in FIG. It is composed of units 34, 35, 36 and 37 and a feature matrix extraction unit 38. A detailed description of each of these components will be given later in the description of the operation, and the outline thereof is as follows.

【００１５】先ず、第２パターンレジスタ３１は、前処
理部１の２値化部１３で２値化された文字パターン（以
下、説明の都合上「文字パターン」と称する。）を格納
するためのものである。なお、２値化部１３では、あら
かじめ記憶したフォーマット情報に基づいて第１のパタ
ーンレジスタ１２に格納された多値画像に対しＲ、Ｇ、
Ｂのそれぞれに適当な閾値を設けて、各画素のＲＧＢ値
のすべてがそれぞれ閾値以下なら黒画素（１）とし、そ
れ以外なら白画素（０）として２値化を行なうものとす
る。この第２パターンレジスタ３１は、上記第１パター
ンレジスタ１２と同様に、その上にＸ−Ｙ座標を仮想的
に設定しこの座標系で表される任意の位置の画素データ
を読み出すことができるものとしてある。また、線幅計
算部３２は、前記第２パターンレジスタ３１に格納され
ている文字パターンからその線幅を検出するためのもの
である。また、文字枠検出部３３は、前記第２パターン
レジスタ３１に格納されている文字パターンの文字枠を
検出するためのものである。また、サブパターン抽出部
３４〜３７は前記第２パターンレジスタ３１に格納され
ている文字パターンを複数の方向に走査し、各走査方向
成分毎のストローク成分を表す複数のサブパターンを抽
出するもので、この例では垂直、水平、右斜めおよび左
斜めの各走査方向毎のサブパターン抽出部で構成してあ
る。ただし、この実施例では、各サブパターン抽出部
は、詳細は後の動作説明において説明するが、各方向成
分毎のストローク成分（具体的には文字パターンにおけ
る黒ビットの量）の抽出量を筆記具判別部２の判別結果
に応じ調整する構成としてある。こうすることで、特徴
抽出部３で抽出する特徴量（この場合は特徴マトリク
ス）を、筆記具判別部２の判別結果に応じ異ならせてい
る。また、特徴マトリクス抽出部３８は、前記第２パタ
ーンレジスタ３１に格納された文字パターンに対応する
領域を各サブパターンについてｍ×ｎ（ｍ，ｎは１以上
の整数で、ｍ＝ｎを含む）の部分領域に分割し、前記各
サブパターン毎の各部分領域内の黒ビットの数と前記線
幅計算部３２で検出した線幅ｗとを用いて特徴量として
の特徴マトリクスを抽出するものである。First, the second pattern register 31 stores a character pattern binarized by the binarizing unit 13 of the preprocessing unit 1 (hereinafter referred to as "character pattern" for convenience of description). It is a thing. The binarization unit 13 uses R, G, and R for the multivalued image stored in the first pattern register 12 based on the format information stored in advance.
An appropriate threshold value is provided for each of B, and if all the RGB values of each pixel are equal to or less than the threshold value, a black pixel (1) is set, and if not, a white pixel (0) is set and binarization is performed. Similar to the first pattern register 12, the second pattern register 31 is capable of virtually setting XY coordinates on the second pattern register 31 and reading out pixel data at an arbitrary position represented by this coordinate system. There is. The line width calculator 32 is for detecting the line width from the character pattern stored in the second pattern register 31. Further, the character frame detection unit 33 is for detecting the character frame of the character pattern stored in the second pattern register 31. The sub-pattern extracting units 34 to 37 scan the character pattern stored in the second pattern register 31 in a plurality of directions and extract a plurality of sub-patterns representing stroke components for each scanning direction component. In this example, the sub-pattern extraction unit is configured for each of the vertical, horizontal, right diagonal, and left diagonal scanning directions. However, in this embodiment, each sub-pattern extraction unit, the details of which will be described later in the description of the operation, determines the extraction amount of the stroke component (specifically, the amount of black bits in the character pattern) for each direction component with a writing instrument. The configuration is adjusted according to the determination result of the determination unit 2. By doing so, the feature amount (feature matrix in this case) extracted by the feature extraction unit 3 is made different according to the determination result of the writing instrument determination unit 2. Further, the feature matrix extraction unit 38 sets the area corresponding to the character pattern stored in the second pattern register 31 for each sub-pattern m × n (m and n are integers of 1 or more, including m = n). And a line width w detected by the line width calculation unit 32 is used to extract a feature matrix as a feature amount. is there.

【００１６】次に、この第一発明の実施の形態の文字認
識装置についてその動作と共にさらに説明する。なお、
この説明を、図４に示した様な被読取媒体４１に対し文
字認識を行なう例を考えながら行なう。すなわち、それ
自体が白色で、その表面の一部に予め白色および黒色以
外の色例えばオレンジ色で印刷された文字記入枠４２を
有し、かつ、該文字記入枠４２内に片仮名の「ホ」の文
字４３が筆記具としてのボールペンまたは鉛筆のいずれ
かで書かれている被読取媒体４１である。Next, the character recognition apparatus according to the embodiment of the first invention will be further described together with its operation. In addition,
This description will be given while considering an example in which character recognition is performed on the medium 41 to be read as shown in FIG. That is, the character itself is white, and a part of the surface thereof has a character entry frame 42 preliminarily printed in a color other than white and black, for example, orange, and the katakana “e” is included in the character entry frame 42. Is a read medium 41 written with either a ballpoint pen or a pencil as a writing tool.

【００１７】先ず、被読取媒体４１を走査して被読取媒
体４１からイメージデータとして色情報を持つカラー画
像（多値画像ともいう。）を得る。これは公知の種々の
方法で行なえるが、例えば以下に図５を用い説明する方
法で行なう。被読取媒体４１を例えばカラースキャナ等
により主走査方向５１および副走査方向５２に光学的に
走査してＲ、Ｇ、Ｂ３色の画素データを持つイメージを
順次に第１パターンレジスタ１２に取り込むことで、第
１パターンレジスタ１２に多値画像を入力する。First, the medium 41 to be read is scanned to obtain a color image (also referred to as a multi-valued image) having color information as image data from the medium 41 to be read. This can be performed by various known methods, for example, by the method described below with reference to FIG. By optically scanning the medium to be read 41 in the main scanning direction 51 and the sub-scanning direction 52 by a color scanner or the like, an image having pixel data of R, G, and B colors is sequentially loaded into the first pattern register 12. , A multi-valued image is input to the first pattern register 12.

【００１８】第１パターンレジスタ１２に多値画像が入
力されると、筆記具判別部２は、この第１パターンレジ
スタ１２から文字を構成する画素の抽出を行い、さらに
該抽出された画素のＲ、Ｇ、Ｂ値に基づいて筆記具の判
別を行う。これら文字を構成する画素の抽出およびＲ、
Ｇ、Ｂ値に基づく筆記具の判別を、この実施例では以下
に説明する様に行う。先ず、文字を構成する画素の抽出
は、文字４３、文字記入枠４２および被読取媒体４１の
色がそれぞれ黒色、オレンジ色および白色というように
互いに違うことを利用して行う。すなわち、黒色で書か
れた文字４３の部分を構成している画素が持つＲ、Ｇ、
Ｂ値は、一般に、文字記入枠４２の部分や被読取媒体４
１の部分を構成している画素のそれよりすべて小さい値
になるので、Ｒ、Ｇ、Ｂ値すべてが所定閾値より小さく
なっている画素を文字を構成する画素として抽出する。
また、筆記具の種類の判別すなわちここでは文字がボー
ルペンで書かれたのか鉛筆で書かれたのかの判別は、次
の様に行う。先ず、Ｒ、Ｇ、Ｂの濃度を横軸にとり、頻
度を縦軸にとった座標系（以下、「濃度−頻度座標系」
という）を考える。そして、文字を構成する画素である
として抽出された画素についてのＲ、Ｇ、Ｂのそれぞれ
の濃度に着目し上記濃度−頻度座標系におけるそれら濃
度の出現頻度に１を累積する。このようにして頻度によ
るヒストグラムを作ると、Ｒ、Ｇ、Ｂ各々の３つのピー
クを持つ濃度分布が得られる。この濃度分布は、鉛筆で
書かれた文字についてのものとボールペンで書かれた文
字についてのものとで異なる。したがって、この濃度分
布の違いにより筆記具が鉛筆なのかボールペンなのかを
判別する。なお、Ｒ、Ｇ、Ｂ値に基づき文字を構成する
画素を抽出する方法や筆記具の種類を判別する上記方法
はこの出願の出願人に係る特願平６−２０９８０５号に
詳しく述べられている。When a multi-valued image is input to the first pattern register 12, the writing instrument discriminating section 2 extracts pixels constituting a character from the first pattern register 12, and further, R of the extracted pixels, The writing instrument is discriminated based on the G and B values. Extraction of pixels that make up these characters and R,
In this embodiment, the writing instrument is discriminated based on the G and B values as described below. First, the extraction of pixels forming a character is performed by utilizing the fact that the colors of the character 43, the character entry frame 42, and the medium to be read 41 are different from each other, such as black, orange, and white, respectively. That is, R, G, which the pixels forming the portion of the character 43 written in black have,
Generally, the B value is the portion of the character entry frame 42 or the medium 4 to be read.
Since all of the pixels forming part 1 have smaller values than those of the pixels, the pixels whose R, G, and B values are all smaller than the predetermined threshold value are extracted as the pixels forming the character.
In addition, the type of writing instrument, that is, whether the character is written by a ballpoint pen or a pencil here is determined as follows. First, a coordinate system in which the abscissa represents the concentrations of R, G, and B and the ordinate represents the frequency (hereinafter, “concentration-frequency coordinate system”).
That)). Then, paying attention to the respective densities of R, G, and B with respect to the pixels extracted as the pixels constituting the character, 1 is accumulated in the appearance frequency of those densities in the density-frequency coordinate system. When the frequency histogram is created in this manner, a concentration distribution having three peaks of R, G, and B can be obtained. This concentration distribution is different for a character written with a pencil and a character written with a ballpoint pen. Therefore, it is determined whether the writing instrument is a pencil or a ballpoint pen based on the difference in the density distribution. The method for extracting the pixels forming a character based on the R, G, B values and the above method for determining the type of writing instrument are described in detail in Japanese Patent Application No. 6-209805 filed by the applicant of the present application.

【００１９】筆記具判別部２により筆記具の判別が終了
すると、特徴抽出部３は、該判別結果である筆記具の種
類を受け取る。また、前処理部１の２値化部１３は、文
字認識に必要とされる文字パターンを得るために、第１
パターンレジスタ１２に格納されている多値画像に対し
２値化処理をして文字パターンを得、これを第２パター
ンレジスタ３１（図３）に入力する。なお、ここでの２
値化処理は、既に説明したとおり、多値画像に対しＲ、
Ｇ、Ｂのそれぞれに適当な閾値を設けて、各画素のＲＧ
Ｂ値のすべてがそれぞれ閾値以下なら黒画素（１）と
し、それ以外なら白画素（０）として行なうものとす
る。しかし２値化処理の方法はこれに限定されず任意の
方法とできる。When the writing implement discriminating unit 2 completes the discrimination of the writing implement, the feature extracting unit 3 receives the type of the writing implement which is the discrimination result. In addition, the binarization unit 13 of the preprocessing unit 1 performs the first conversion in order to obtain a character pattern required for character recognition.
The multi-valued image stored in the pattern register 12 is binarized to obtain a character pattern, which is input to the second pattern register 31 (FIG. 3). In addition, 2 here
As described above, the binarization process is performed on the multivalued image with R,
RG of each pixel is set by setting an appropriate threshold for each of G and B.
If all the B values are less than or equal to the threshold value, the black pixel (1) is set, and otherwise, the white pixel (0) is set. However, the method of binarization processing is not limited to this and can be any method.

【００２０】文字パターンが第２パターンレジスタ３１
に格納されると、線幅計算部３２（図３）は、この文字
パターンの線幅を検出する。これをこの実施の形態では
次の様に行う。線幅計算部３２は、周知のフィルタ回路
と同様にシフトレジスタ構成となっており、この文字パ
ターンについて２×２の窓のすべてが黒ビットとなる点
の総数Ｑと全黒ビット数Ａとを計数し、そして下記の周
知の近似式（１）を用いて線幅ｗを計算する。The character pattern is the second pattern register 31.
When stored in, the line width calculator 32 (FIG. 3) detects the line width of this character pattern. This is performed as follows in this embodiment. The line width calculation unit 32 has a shift register configuration like a well-known filter circuit, and for this character pattern, the total number Q of points where all 2 × 2 windows are black bits and the total number A of black bits are set. Count and calculate the line width w using the well known approximation formula (1) below.

【００２１】ｗ＝１／｛１−（Ｑ／Ａ）｝・・・（１）また、垂直サブパターン抽出部３４は、第２パターンレ
ジスタ３１について垂直走査を行い、線幅計算部の結果
ｗをパラメータとして、走査方向の黒ビットの連続する
長さＳが下記の関係式（２）を満足しない黒ビット部分
を削除することにより、垂直サブパターン（以下、「Ｖ
ＳＰ」と略称することもある。）を得る。W = 1 / {1- (Q / A)} (1) Further, the vertical sub-pattern extraction unit 34 performs a vertical scan on the second pattern register 31, and the result w of the line width calculation unit is obtained. Is used as a parameter, the vertical sub-pattern (hereinafter referred to as “V”) is deleted by deleting a black bit portion whose continuous length S of black bits in the scanning direction does not satisfy the following relational expression (2).
It may be abbreviated as "SP". ) Get.

【００２２】Ｓ≧Ｎ・ｗ・・・（２）なお、（２）式中のＮは筆記具判別部２で判別された筆
記具の種類に応じ決められる値である。この実施の形態
では、各サブパターン抽出部３４〜３７の各々が、筆記
具判別部２から入力された筆記具の種類を示す情報に基
づきこのＮを選択する構成としている。選択するべきＮ
は例えば好適なテーブルＲＯＭに記憶しておきこれを読
み出す方法などで行える。具体的には、かすれ文字が多
くなり易い筆記具の場合はＮを小さい値に選択する。ま
た、つぶれ文字が多くなり易い筆記具の場合はＮを大き
い値に選択する。このようにＮを筆記具の種類に応じ変
えるため、被読取媒体に文字を記入した際に用いた筆記
具の種類によって発生するかすれやすぶれ等の文字の変
形を吸収した安定したサブパターンが得られる。S ≧ N · w (2) It should be noted that N in the equation (2) is a value determined according to the type of the writing instrument determined by the writing instrument determination unit 2. In this embodiment, each of the sub-pattern extraction units 34 to 37 is configured to select this N based on the information indicating the type of writing instrument input from the writing instrument determination unit 2. N to choose
Can be stored in a suitable table ROM and read out. Specifically, in the case of a writing instrument which tends to have many blurred characters, N is selected to a small value. Also, in the case of a writing instrument which tends to have many collapsed characters, N is set to a large value. Since N is changed according to the type of writing instrument as described above, it is possible to obtain a stable sub-pattern that absorbs deformation of characters such as blurring or blurring that occurs depending on the type of writing instrument used when writing characters on the medium to be read.

【００２３】上記垂直サブパターン３４の動作と同様
に、水平サブパターン抽出部３５は水平走査により水平
サブパターン（ＨＳＰ）を、右斜めサブパターン抽出部
３６は右斜め４５°走査によって右斜めサブパターン
（ＲＳＰ）を、左斜めサブパターン抽出部３７は左斜め
４５°走査によって左斜めサブパターン（ＬＳＰ）をそ
れぞれ抽出する。片仮名「ホ」の文字パターンから上記
の様に抽出されるＶＳＰ、ＨＳＰ、ＲＳＰおよびＬＳＰ
を、この「ホ」の文字パターンと共に図６（Ａ）〜
（Ｅ）にそれぞれ示した。Similar to the operation of the vertical sub-pattern 34, the horizontal sub-pattern extraction unit 35 scans a horizontal sub-pattern (HSP) by horizontal scanning, and the right diagonal sub-pattern extraction unit 36 scans a right diagonal 45 ° by a right diagonal sub-pattern. The left diagonal sub-pattern extracting unit 37 extracts the left diagonal sub-pattern (LSP) by scanning the left diagonal 45 °. VSP, HSP, RSP and LSP extracted as above from the character pattern of katakana "e"
Together with the character pattern of "e" in FIG.
Each is shown in (E).

【００２４】また、文字枠検出部３３は第２パターンレ
ジスタ３１内の文字パターンに外接する文字枠に対応す
る領域を検出しこの結果を特徴マトリクス抽出部３８に
送る。この場合の検出された文字枠は図６（Ａ）〜
（Ｅ）において破線で示す四角となる。なお、文字枠検
出は、例えば、第２パターンレジスタ３１内を走査して
黒ビットが出現する境界枠を検出することで行える。Further, the character frame detecting section 33 detects the area corresponding to the character frame circumscribing the character pattern in the second pattern register 31 and sends the result to the feature matrix extracting section 38. The detected character frame in this case is shown in FIG.
In (E), it becomes a square indicated by a broken line. The character frame can be detected, for example, by scanning the inside of the second pattern register 31 and detecting the boundary frame in which the black bit appears.

【００２５】また、特徴マトリクス抽出部３８は、第２
パターンレジスタ３１における文字枠に対応する領域を
垂直、水平、右斜め、左斜めの各サブパターンについて
ｍ×ｎの領域に分割する。図７はｍ＝ｎ＝５とし、図６
（Ｂ）に示した上記「ホ」の垂直サブパターンを分割し
た例である。さらに、特徴マトリクス抽出部３８は、こ
のように分割した各領域の黒ビット数Ｂ_ijをそれぞれ計
数し、そして上記線幅ｗを使用して下記の（３）式によ
り線長を示す特徴Ｌ_ijを計算して、ここではｍ×ｎ×４
次元の特徴マトリクスを作成する。ここで、Ｂ_ij、Ｌ_ij
におけるｉは特徴マトリクスの各構成要素における水平
方向の要素、ｊは同じく垂直方向の要素を示す記号であ
る。Further, the feature matrix extraction unit 38 uses the second
The region corresponding to the character frame in the pattern register 31 is divided into m × n regions for each of the vertical, horizontal, right diagonal, and left diagonal sub patterns. In FIG. 7, m = n = 5.
This is an example in which the vertical sub-pattern of "e" shown in FIG. Further, the feature matrix extraction unit 38 counts the number of black bits B _ij of each region thus divided, and uses the line width w to _{express the} feature L _ij indicating the line length by the following equation (3). Is calculated, and here, m × n × 4
Create a dimensional feature matrix. Where B _ij and L _ij
In the above, i is a symbol indicating a horizontal element in each component of the feature matrix, and j is a symbol indicating a vertical element.

【００２６】Ｌ_ij＝Ｂ_ij／ｗ・・・（３）なお、このように得られた特徴マトリクスに対し一般に
は正規化処理を行い、最終的なｍ×ｎ×４の正規化特徴
マトリクス（すなわち文字パターンの特徴量）を得る。
この正規化は、典型的には、ＶＳＰ特徴マトリクスにつ
いては文字枠のｙ軸方向の長さΔＹで、ＨＳＰ特徴マト
リクスについては同ｘ軸方向の長さΔＸで、ＲＳＰおよ
びＬＳＰの各特徴マトリクスについては｛（ΔＸ）² ＋
（ΔＹ）² ｝^1/2 でそれぞれ正規化することで行なえ
る。L _ij = B _ij / w (3) It should be noted that the feature matrix thus obtained is generally subjected to normalization processing to obtain a final m × n × 4 normalized feature matrix ( That is, the feature amount of the character pattern) is obtained.
This normalization is typically performed for the VSP feature matrix with the y-axis length ΔY of the character frame, for the HSP feature matrix with the x-axis length ΔX, and for each of the RSP and LSP feature matrices. Is {(ΔX) ² +
This can be done by normalizing with (ΔY) ² } ^1/2 .

【００２７】また識別部４は、予め用意されている複数
の標準文字パターンの特徴量いわゆる標準文字マスク
（ｆｍ）と、上記特徴マトリクス抽出部３８で作成した
特徴マトリクス（ｆｉ）との間に、周知の下記（４）式
で与えられる距離Ｄ、すなわちｍ×ｎ×４次元特徴空間
における２つのベクトルの差分ベクトルの長さが最少の
値を与える標準マスクのカテゴリー名を文字名出力とし
て出力する。The discriminating unit 4 is arranged between the characteristic amount of a plurality of standard character patterns prepared in advance, so-called standard character mask (fm), and the characteristic matrix (fi) created by the characteristic matrix extracting unit 38. The distance D given by the well-known formula (4), that is, the category name of the standard mask that gives the smallest value of the difference vector length of two vectors in the m × n × 4 dimensional feature space is output as the character name output. .

【００２８】Ｄ＝｛Σ（ｆｍ−ｆｉ）² ｝^1/2 ・・・（４）上述の説明から明らかな様に、この第一発明では、特徴
抽出部で文字パターンの特徴を抽出する際に筆記具の種
類の判別結果を考慮した処理がなされるので、筆記具の
種類に起因するかすれやつぶれ等を除去した安定した特
徴量が抽出出来る。このため、その後の認識処理を従来
通り行なっても、筆記具が異なることに起因する認識率
の低下は従来より軽減される。D = {Σ (fm-fi) ² } ^1/2 (4) As is apparent from the above description, according to the first aspect of the invention, when the feature of the character pattern is extracted by the feature extracting unit. In addition, since the processing is performed in consideration of the determination result of the type of writing instrument, it is possible to extract a stable feature amount in which blurring or crushing caused by the type of writing instrument is removed. For this reason, even if the subsequent recognition processing is performed as usual, the reduction in the recognition rate due to the difference in the writing instrument is reduced.

【００２９】２．第二発明の説明上述の第一発明では、被読取媒体に文字を記入した際に
用いた筆記具の種類に起因するかすれやつぶれを特徴抽
出部３において補正していたが、これをぼけ処理におい
て行なうことも出来る。この第二発明はその例である。2. Description of the Second Invention In the first invention described above, the feature extraction unit 3 corrects the blurring and the crushing due to the type of the writing instrument used when writing characters on the medium to be read. You can also do it. This second invention is an example.

【００３０】図８は第二発明の実施の形態の文字認識装
置の構成を示すブロック図である。この第二発明の実施
の形態の文字認識装置は、前処理部１と、筆記具判別部
２と、特徴抽出部３ａと、ぼけ処理部５と、識別部４と
を具える。これら構成成分のうち、前処理部１と、筆記
具判別部２と、識別部４は、それぞれ、第一発明の実施
の形態の文字認識装置の各部と同様なもので構成出来
る。ただし、筆記具判別部２は筆記具判別結果をぼけ処
理部５に出力する。そこで、第一発明の文字認識装置と
相違する構成成分である特徴抽出部３ａおよびぼけ処理
部５についての構成を以下に説明する。FIG. 8 is a block diagram showing the configuration of the character recognition device according to the second embodiment of the present invention. The character recognition device according to the second embodiment of the present invention includes a preprocessing unit 1, a writing instrument determination unit 2, a feature extraction unit 3a, a blur processing unit 5, and an identification unit 4. Of these constituent components, the preprocessing unit 1, the writing instrument determination unit 2, and the identification unit 4 can be configured by the same units as the respective units of the character recognition device according to the embodiment of the first invention. However, the writing instrument determination unit 2 outputs the writing instrument determination result to the blur processing unit 5. Therefore, the configuration of the feature extraction unit 3a and the blurring processing unit 5 which are constituent components different from those of the character recognition device of the first invention will be described below.

【００３１】先ず、第二発明における特徴抽出部３ａ
は、図９に示した様に、第２パターンレジスタ３１、線
幅計算部３２、文字枠検出部３３、垂直サブパターン抽
出部３４ａ、水平サブパターン抽出部３５ａ、右斜めサ
ブパターン抽出部３６ａ、左斜めサブパターン抽出部３
７ａおよび特徴マトリクス抽出部３８で構成してある。
これらのうち、第２パターンレジスタ３１、線幅計算部
３２、文字枠検出部３３および特徴マトリクス抽出部３
８は、第一発明における各部と同様な構成としてある。
一方各サブパターン抽出部３４ａ〜３７ａ各々について
は、第一発明では筆記具判別部２から筆記具の種類を示
す情報を各サブパターン抽出部に入力し、これに基づい
て黒ビットの抽出を制御していたのに対し、この第二発
明ではこれをしていない。すなわち、各サブパターン抽
出部３４ａ〜３７ａは、各走査方向の黒ビットの連続す
る長さＳが線幅ｗに対し下記の関係（５）を満足しない
黒ビット部分を削除することにより、各方向のサブパタ
ーンＶＳＰ、ＨＳＰ、ＲＳＰ、ＬＳＰを得る。ただし、
Ｔは予め定められた定数である。First, the feature extraction unit 3a in the second invention.
9, as shown in FIG. 9, the second pattern register 31, the line width calculation unit 32, the character frame detection unit 33, the vertical sub-pattern extraction unit 34a, the horizontal sub-pattern extraction unit 35a, the right diagonal sub-pattern extraction unit 36a, Left diagonal sub-pattern extraction unit 3
7a and a characteristic matrix extraction unit 38.
Of these, the second pattern register 31, the line width calculation unit 32, the character frame detection unit 33, and the feature matrix extraction unit 3
8 has the same structure as each part in the first invention.
On the other hand, with respect to each of the sub-pattern extraction units 34a to 37a, in the first invention, the information indicating the type of the writing tool is input from the writing tool determination unit 2 to each sub-pattern extraction unit, and black bit extraction is controlled based on the information. On the other hand, this second invention does not do this. That is, each of the sub-pattern extraction units 34a to 37a deletes a black bit portion in which the continuous length S of black bits in each scanning direction does not satisfy the following relationship (5) with respect to the line width w, thereby removing each direction. Subpatterns VSP, HSP, RSP, LSP of However,
T is a predetermined constant.

【００３２】Ｓ≧Ｔ・ｗ・・・（５）したがって、この場合に抽出される各サブパターンは、
筆記具の違いに起因する特徴を含むものとなる。これは
後のぼけ処理部で補正されることになる。S ≧ T · w (5) Therefore, each sub-pattern extracted in this case is
It will include features due to differences in writing instruments. This will be corrected in the blur processing unit later.

【００３３】また、特徴抽出部３ａの特徴マトリクス抽
出部３８（図９）は、第一発明の場合と同様に各サブパ
ターン毎の正規化特徴マトリクスを作成する。Further, the feature matrix extraction unit 38 (FIG. 9) of the feature extraction unit 3a creates a normalized feature matrix for each sub-pattern as in the case of the first invention.

【００３４】また、ぼけ処理部５は前記筆記具判別部２
で得られた判別結果に応じ前記文字パターンの特徴量
（この場合は特徴抽出部３ａで得られた正規化特徴マト
リクス）に施すぼけ処理の程度を異ならせる構成のもの
である。このためこの実施の形態の場合のぼけ処理部５
は、次のような動作をする構成のものとしてある。The blur processing unit 5 is the writing instrument discrimination unit 2 described above.
The degree of blurring processing applied to the feature amount of the character pattern (in this case, the normalized feature matrix obtained by the feature extraction unit 3a) is varied according to the determination result obtained in (1). Therefore, the blur processing unit 5 in the case of this embodiment
Has a configuration that operates as follows.

【００３５】すなわちぼけ処理部５は、第二発明におけ
る特徴抽出部３ａで抽出された正規化特徴マトリクスに
対しこの実施の形態では次に示すような方法でサブパタ
ーンの抽出方向に対してぼけ処理をする。That is, the blurring processing unit 5 performs blurring processing on the normalized feature matrix extracted by the feature extracting unit 3a in the second invention in the subpattern extraction direction by the following method in this embodiment. do.

【００３６】まず、垂直サブパターン抽出部３４ａで抽
出された垂直サブパターンに基づいて特徴マトリクス抽
出部３８が作成した垂直特徴マトリクスの要素Ｍ_ijにつ
いては、次の（６）式を適用してぼけ特徴マトリクスの
要素Ｍ_Bij を計算する。ここで、Ｍ_ij、Ｍ_Bij における
ｉは，特徴マトリクスの各構成要素における水平方向の
要素、ｊは同じく垂直方向の要素を示す記号である。ま
た、Ｃは、筆記具判別部２の判別結果に応じ決められる
値である（以下の（７）〜（９）式において同じ。）。
具体的には、かすれ文字が多くなり易い場合はＣを小さ
くしてぼけの度合いを大きくする。また、つぶれ文字が
多くなり易い場合はＣを大きくしてぼけの度合を小さく
する。なお、要素Ｍ_ijのうち最も外側の領域について計
算する場合は、さらに外側の領域の要素（Ｍ_i+1,j とか
Ｍ_i,j+1 ）はこの最も外側の要素に等しいものとみなす
（以下の（７）〜（９）式において同じ。）。すなわ
ち、上記各式右辺における特徴マトリクスの各要素をＭ
_p,q と総称するとしたときその添字ｐ，ｑについて、ｐ
＞ｍまたはｑ＞ｎ（ｍ，ｎは上記分割領域を決めた
値。）またはｐ＜１またはｑ＜１が成立する場合はＭ_pq
＝Ｍ_ijとする（以下の（７）〜（９）式において同
じ。）。First, for the elements M _ij of the vertical feature matrix created by the feature matrix extractor 38 based on the vertical sub-patterns extracted by the vertical sub-pattern extractor 34a, the following equation (6) is applied to blur. Compute the element M _{Bij of the} feature matrix. Here, i in M _ij and M _Bij is a symbol indicating a horizontal element in each component of the feature matrix, and j is a symbol indicating a vertical element. Further, C is a value determined according to the discrimination result of the writing instrument discriminating unit 2 (the same applies to the following equations (7) to (9)).
Specifically, when the number of faint characters tends to increase, C is reduced to increase the degree of blurring. When the number of characters that are blurred is likely to increase, C is increased to reduce the degree of blurring. When calculating the outermost area of the elements M _ij , the elements (M _{i + 1, j} or M _{i, j + 1} ) in the further outer area are regarded as equal to the outermost element ( The same applies to the following expressions (7) to (9). That is, each element of the feature matrix on the right side of each of the above equations is set to M
_p, the subscript p when you want to collectively referred to as _q, for q, p
> M or q> n (m and n are values that determine the above-mentioned divided areas) or p <1 or q <1 is satisfied, M _pq
= M _ij (same in the following expressions (7) to (9)).

【００３７】[0037]

【数１】 [Equation 1]

【００３８】この（６）式は、注目する垂直マトリクス
要素値Ｍ_ijとそれに水平方向で隣接する要素値Ｍ_i-1,j
およびＭ_i+1,j とを用いて、重みづけした平均値でぼか
すことを意味する。ここで、各要素値は、垂直サブパタ
ーンの各領域における長さと対応づけられる量なので、
結局この（６）式は各要素の長さを水平方向に再配分す
るものと考えて良い。This equation (6) is used for the vertical matrix element value M _{ij of} interest and the element values M _{i-1, j} adjacent to it in the horizontal direction.
And M _{i + 1, j} are used to mean blurring with a weighted average value. Here, each element value is an amount associated with the length in each region of the vertical sub-pattern, so
After all, it can be considered that the expression (6) redistributes the length of each element in the horizontal direction.

【００３９】同様に、水平サブパターン抽出部３５ａで
抽出された水平サブパターンに基づいて特徴マトリクス
抽出部３８が作成した水平特徴マトリクスの要素Ｍ_ijに
ついて、次の（７）式を適用してぼけ特徴マトリクスの
要素Ｍ_Bij を計算する。この（７）式は、水平特徴マト
リクスの各要素の長さを垂直方向に再配分するものと考
えて良い。Similarly, for the elements M _ij of the horizontal feature matrix created by the feature matrix extractor 38 based on the horizontal sub-patterns extracted by the horizontal sub-pattern extractor 35a, the following equation (7) is applied to blur. Compute the element M _{Bij of the} feature matrix. It can be considered that the expression (7) redistributes the length of each element of the horizontal feature matrix in the vertical direction.

【００４０】[0040]

【数２】 [Equation 2]

【００４１】同様に、右斜めサブパターン抽出部３６ａ
で抽出された右斜めサブパターンに基づいて特徴マトリ
クス抽出部３８が作成した右斜め特徴マトリクスの要素
Ｍ_ijについて、次の（８）式を適用してぼけ特徴マトリ
クスの要素Ｍ_Bij を計算する。この（８）式は、右斜め
特徴マトリクスの各要素の長さを左斜め方向に再配分す
るものと考えて良い。Similarly, the right diagonal sub-pattern extraction section 36a
The element M _Bij of the blur feature matrix is calculated by applying the following equation (8) to the element M _ij of the right diagonal feature matrix created by the feature matrix extracting unit 38 on the basis of the right diagonal sub-pattern extracted in. It can be considered that the equation (8) redistributes the length of each element of the right diagonal feature matrix in the left diagonal direction.

【００４２】[0042]

【数３】 (Equation 3)

【００４３】同様に、左斜めサブパターン抽出部３７ａ
で抽出された左斜めサブパターンに基づいて特徴マトリ
クス抽出部３８が作成した左斜め特徴マトリクスの要素
Ｍ_ijについて、次の（９）式を適用してぼけ特徴マトリ
クスの要素Ｍ_Bij を計算する。この（９）式は、左斜め
特徴マトリクスの各要素の長さを右斜め方向に再配分す
るものと考えて良い。Similarly, the left diagonal sub-pattern extracting section 37a
With respect to the element M _ij of the left diagonal feature matrix created by the feature matrix extracting unit 38 based on the left diagonal sub-pattern extracted in step (4), the following equation (9) is applied to calculate the element M _Bij of the blur feature matrix. It can be considered that the expression (9) redistributes the length of each element of the left diagonal feature matrix in the right diagonal direction.

【００４４】[0044]

【数４】 (Equation 4)

【００４５】このようにＣをパラメータとして筆記具の
種類に応じてぼけ特徴マトリクス要素を変えるので、文
字パターンを被読取媒体に記入した際の筆記具の種類に
よって発生するかすれやつぶれ等の変形をこのぼけ処理
で吸収でき、したがって安定したぼけ特徴マトリクスが
得られる。その後は、識別部４がこのぼけ特徴マトリク
スを予め用意されている複数の標準文字パターンの特徴
量と照合して、前記文字パターンを識別する処理を行
う。これについては、基本的には第一発明と同様である
から、その説明を省略する。As described above, since the blur feature matrix element is changed according to the type of writing instrument with C as a parameter, deformation such as blurring or crushing that occurs depending on the type of writing instrument when a character pattern is written on a medium to be read is blurred. The process can be absorbed and thus a stable blur feature matrix is obtained. After that, the identification unit 4 compares the blur feature matrix with the feature amounts of a plurality of standard character patterns prepared in advance, and identifies the character pattern. This is basically the same as in the first aspect of the invention, so description thereof will be omitted.

【００４６】上述においてはこの出願の各発明の実施の
形態について説明したがこの発明は上述の形態に限られ
ない。Although the embodiments of the inventions of this application have been described above, the invention is not limited to the above-mentioned embodiments.

【００４７】例えば上述の各実施の形態では筆記具を鉛
筆及びボールペンとした例を説明したがこの発明は他の
任意の筆記具例えばサインペン等にも適用できる。ま
た、上述の実施の形態では入力イメージからの文字を構
成する画素の抽出および筆記具の判別をいずれも画素の
Ｒ，Ｇ，Ｂの色情報に基づいて行っていたが、例えば色
相、彩度或は明度の違いに着目して上記画素の抽出や筆
記具の判別を行うこともできる。また、もちろん、被読
取媒体がカラー画像ではなくモノクロ画像（モノクロの
イメージデータ）である場合もこれら発明は適用でき
る。For example, in each of the above-described embodiments, an example in which the writing instrument is a pencil or a ballpoint pen has been described, but the present invention can be applied to any other writing instrument, such as a felt-tip pen. Further, in the above-described embodiment, the extraction of the pixels forming the character from the input image and the determination of the writing tool are both performed based on the color information of R, G, B of the pixel. Can also perform the above-mentioned pixel extraction and writing instrument discrimination by paying attention to the difference in brightness. Of course, the present invention can be applied to the case where the medium to be read is not a color image but a monochrome image (monochrome image data).

【００４８】[0048]

【発明の効果】上述した説明から明らかなように、この
出願の第一発明の文字認識装置によれば、筆記具判別部
と特徴抽出部と識別部とを具える文字認識装置におい
て、前記特徴抽出部を、前記筆記具判別部によって得ら
れる判別結果に応じ抽出する特徴量を異ならせる構成の
ものとしたので、筆記具の種類の違いに起因するかすれ
やつぶれ等の除去が可能な特徴量抽出ができ、かつ、こ
の特徴量を用いての認識処理が行える。また、第二発明
の文字認識装置によれば、筆記具判別部と特徴抽出部と
ぼけ処理部と識別部とを具える文字認識装置において、
前記ぼけ処理部を、前記筆記具判別部で得られた判別結
果に応じ前記文字パターンの特徴量に施すぼけ処理の程
度を異ならせる構成のものとしたので、筆記具の種類の
違いに起因するかすれやつぶれ等の除去が可能なぼけ処
理ができ、かつ、これで得られるぼけ特徴マトリクスを
用いての認識処理が行える。このため、これら第一およ
び第二発明では、被読取媒体へ文字などを記入する際に
用いる筆記具の種類が異なることでエラーやリジェクト
になっていた文字パターンを正読できるので、認識精度
が従来より高い文字認識装置が実現できる。As is apparent from the above description, according to the character recognition device of the first invention of the present application, in the character recognition device having the writing instrument discrimination part, the feature extraction part and the identification part, the feature extraction is performed. Since the part is configured to have a different feature amount to be extracted according to the determination result obtained by the writing instrument determination unit, it is possible to extract a feature amount that can remove blurring or crushing due to the difference in the type of writing instrument. In addition, recognition processing can be performed using this feature amount. Further, according to the character recognition device of the second invention, in a character recognition device comprising a writing tool discrimination unit, a feature extraction unit, a blur processing unit and an identification unit,
Since the blur processing unit is configured to change the degree of blur processing performed on the feature amount of the character pattern according to the determination result obtained by the writing instrument determination unit, blurring caused by a difference in writing instrument type or Blurring processing capable of removing crushing and the like can be performed, and recognition processing can be performed using the blur feature matrix obtained thereby. Therefore, in these first and second inventions, it is possible to correctly read the character pattern that has been an error or rejected due to the difference in the type of writing instrument used when writing characters or the like on the medium to be read. A higher character recognition device can be realized.

[Brief description of drawings]

【図１】第一発明の実施の形態の文字認識装置の説明図
である。FIG. 1 is an explanatory diagram of a character recognition device according to an embodiment of the first invention.

【図２】第一、第二発明の実施の形態の文字認識装置に
おける前処理部の構成例の説明図である。FIG. 2 is an explanatory diagram of a configuration example of a preprocessing unit in the character recognition device of the first and second embodiments.

【図３】第一発明の実施の形態の文字認識装置における
特徴抽出部の構成例の説明図である。FIG. 3 is an explanatory diagram of a configuration example of a feature extraction unit in the character recognition device according to the embodiment of the first invention.

【図４】実施の形態における被読取媒体の説明図であ
る。FIG. 4 is an explanatory diagram of a medium to be read according to the embodiment.

【図５】パターンレジスタにおける処理の説明図であ
る。FIG. 5 is an explanatory diagram of processing in a pattern register.

【図６】サブパターンの抽出例の説明図である。FIG. 6 is an explanatory diagram of an example of extracting a sub pattern.

【図７】文字枠の分割例の説明図である。FIG. 7 is an explanatory diagram of an example of dividing a character frame.

【図８】第二発明の実施の形態の文字認識装置の構成例
の説明図である。FIG. 8 is an explanatory diagram of a configuration example of a character recognition device according to an embodiment of the second invention.

【図９】第二発明の実施の形態の文字認識装置における
特徴抽出部の構成例の説明図である。FIG. 9 is an explanatory diagram of a configuration example of a feature extraction unit in the character recognition device according to the second embodiment of the invention.

[Explanation of symbols]

１：前処理部２：筆記具判別部３：特徴抽出部３ａ：第二発明における特徴抽出部４：識別部５：ぼけ処理部 1: Preprocessing part 2: Writing instrument discrimination part 3: Feature extraction part 3a: Feature extraction part in the second invention 4: Identification part 5: Blur processing part

Claims

[Claims]

1. A writing instrument discriminating unit for discriminating the type of writing instrument used when a character is written on a medium to be read based on image data of the medium to be read, and a preprocessing unit for obtaining a character pattern from the image data. A feature extraction unit for extracting the feature amount of the character pattern, and an identification for identifying the character pattern by collating the extracted feature amount of the character pattern with the feature amounts of a plurality of standard character patterns prepared in advance. A character recognition device including a section, the character extraction device being configured to change the feature amount to be extracted by the feature extraction unit according to a determination result obtained by the writing instrument determination unit.

2. The character recognition device according to claim 1, wherein the feature extraction unit includes a line width calculation unit that detects a line width w from the character pattern, and the character pattern is scanned in a plurality of directions. A group of black bits satisfying S ≧ N · w (N is a value determined according to the discrimination result in the writing instrument discriminating section) between the continuous length S of black bits in a plurality of scanning directions and the line width w. By extracting a plurality of sub-patterns representing stroke components for each of the plurality of scanning-direction components, and m × n (m, n is 1 for each sub-pattern extracted by the sub-pattern extraction unit. The above positive integers may be the same as each other.), And the feature quantity is calculated using the number of black bits and the line width in each partial area for each sub-pattern. Features as A character recognition device comprising a feature matrix extraction unit for creating a lix.

3. A writing instrument discriminating unit which discriminates the type of writing instrument used when a character is written on a medium to be read based on image data of the medium to be read, and a preprocessing unit which obtains a character pattern from the image data. , A feature extraction unit that extracts the feature amount of the character pattern, a blur processing unit that performs blur processing on the extracted feature amount of the character pattern, and a feature amount that has been subjected to the blur processing of the character pattern are prepared in advance. In a character recognition device comprising an identification unit that identifies the character pattern by comparing it with the characteristic amount of a plurality of standard character patterns, the blurring processing unit, in the determination result obtained by the writing instrument determination unit. The character recognition device is configured to change the degree of blurring processing applied to the characteristic amount of the character pattern accordingly.