JPS62262194A - Optical character reader - Google Patents

Optical character reader

Info

Publication number
JPS62262194A
JPS62262194A JP61103863A JP10386386A JPS62262194A JP S62262194 A JPS62262194 A JP S62262194A JP 61103863 A JP61103863 A JP 61103863A JP 10386386 A JP10386386 A JP 10386386A JP S62262194 A JPS62262194 A JP S62262194A
Authority
JP
Japan
Prior art keywords
character
pattern
projection
width
section
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP61103863A
Other languages
Japanese (ja)
Other versions
JPH0576674B2 (en)
Inventor
Michio Terai
寺井 道夫
Shigeru Horii
堀井 茂
Yoshikazu Kobayashi
美和 小林
Kazuo Ito
伊藤 和郎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oki Electric Industry Co Ltd
Original Assignee
Oki Electric Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oki Electric Industry Co Ltd
Priority to JP61103863A
Publication of JPS62262194A
Publication of JPH0576674B2

Abstract

PURPOSE: To reduce the rate of rejection and erroneous recognition, and to improve working efficiency, by providing a dictionary memory in which character patterns of continuous characters are registered in advance and by matching and recognizing continuous characters as they are. CONSTITUTION: A reading unit 1 reads the object to be read on a form, and a binarization unit 2 outputs a binarized image signal. A line cutting unit 3 cuts out the image signal for each character line; a projection extraction unit 4 extracts a projection, orthogonal to the line, of the image signal contained in one character line; and a projection storage unit 5 temporarily stores the projection. An isolated pattern cutting unit 6 cuts out the image signal of each black portion of the projection, and a pattern width determination unit 7 compares the width of the isolated pattern with a prescribed character width to decide whether the pattern is a single character or continuous characters. A dictionary switching unit 13 selects between the single character patterns and the continuous character patterns registered in advance in a single character dictionary memory 9 and a continuous character dictionary memory 10; when the pattern is continuous characters, a determination unit 12 matches and recognizes it as it is, thereby reducing rejection and erroneous recognition.

Description

DETAILED DESCRIPTION OF THE INVENTION (Field of Industrial Application) The present invention relates to an optical character reader, and more particularly to a method for recognizing printed characters on a form.

(Prior Art) FIG. 2 is a block diagram showing a conventional optical character reader. In the figure, 21 is a reading unit that optically reads the object to be read on a form; 22 is a binarization unit that binarizes the analog signal or 16-gradation digital signal read by the reading unit 21 against a threshold, with black as "1" and white as "0", to obtain an image signal; 23 is a line cutting unit that cuts out the image signal obtained by the binarization unit 22 for each character line; 24 is a projection extraction unit that projects the image signal contained in the character line cut out by the line cutting unit 23 in the direction orthogonal to the line and extracts a projection in which the presence or absence of black is expressed as "1" or "0"; 25 is a projection storage unit that temporarily stores the projection data extracted by the projection extraction unit 24; 26 is an isolated pattern cutting unit that, in the projection extracted by the projection extraction unit 24 and stored in the projection storage unit 25, cuts out as an isolated pattern the image signal corresponding to each run of consecutive "1" (black) values bounded on the left and right by "0" (white); 27 is a pattern width determination unit that detects the width of the isolated pattern cut out by the isolated pattern cutting unit 26, compares it with the maximum character width (the limit up to which a pattern can be regarded as one character), and supplies the isolated pattern to the recognition unit 29 when the width is within the maximum character width or to the forced cutting unit 28 when it is larger; 28 is a forced cutting unit that forcibly cuts the isolated pattern at the maximum character width and supplies the resulting character pattern to the recognition unit 29; and 29 is a recognition unit that recognizes the character pattern from the isolated pattern cutting unit 26 or the forced cutting unit 28 by matching it against character patterns registered in advance in a dictionary memory.

Next, the operation of the conventional example will be described with reference to FIG. 2.

First, the reading unit 21 reads the object to be read, such as characters on the form; the binarization unit 22 binarizes it with black as "1" and white as "0"; and the line cutting unit 23 cuts out each character line. The projection extraction unit 24 projects each cut-out character line in the direction perpendicular to the line and expresses the presence or absence of black as "1" or "0". The result is illustrated in FIG. 3, which shows the relationship between isolated patterns and projections. In the figure, 30 to 32 are projections and 33 to 35 are images. The operation of obtaining a projection is referred to here as projection extraction.

The "1" and "0" projection information obtained by projection extraction is stored in the projection storage unit 25. The isolated pattern cutting unit 26 cuts out, as isolated patterns, the images 33, 34, and 35 of FIG. 3 on the character line, each of which corresponds to a run of consecutive "1" (black) values bounded on the left and right by "0" (white) in the extracted projection. Each isolated pattern is regarded as one character and matched against the character patterns registered in advance in the dictionary memory of the recognition unit 29 to find the corresponding character. Basically, characters can be recognized by this method.
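The projection extraction and isolated pattern cutting described above can be sketched in Python as follows. This is only an illustrative sketch: the function names, the list-of-rows image representation, and the sample data are assumptions introduced here and do not appear in the patent.

```python
from typing import List, Tuple


def extract_projection(line_image: List[List[int]]) -> List[int]:
    """Project a binarized character line in the direction orthogonal to the line.

    Each column yields 1 if it contains at least one black pixel, otherwise 0.
    """
    if not line_image:
        return []
    width = len(line_image[0])
    return [1 if any(row[x] for row in line_image) else 0 for x in range(width)]


def cut_isolated_patterns(projection: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, for each run of 1s."""
    runs = []
    start = None
    for x, value in enumerate(projection):
        if value == 1 and start is None:
            start = x                      # a black run begins
        elif value == 0 and start is not None:
            runs.append((start, x))        # a white column closes the run
            start = None
    if start is not None:                  # run reaches the right edge
        runs.append((start, len(projection)))
    return runs


# Hypothetical 3-row character line containing two separated patterns.
line = [
    [0, 1, 1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 0, 1, 0],
]
projection = extract_projection(line)
print(projection)                          # [0, 1, 1, 0, 0, 1, 1, 0]
print(cut_isolated_patterns(projection))   # [(1, 3), (5, 7)]
```

Each returned column range plays the role of one isolated pattern; its width is simply `end - start`.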

However, in the case of printed characters, several characters may run together, and as shown in FIG. 3(b), an isolated pattern does not necessarily correspond to one character. Therefore, for printed characters, the pattern width determination unit 27 of FIG. 2 detects the width A of the isolated pattern in FIG. 3(a). When this width exceeds the maximum character width, the forced cutting unit 28 of FIG. 2 cuts the pattern at the maximum character width, and the result is matched against the character patterns in the dictionary memory of the recognition unit 29 for recognition.

(Problem to Be Solved by the Invention) However, in the above method, when continuous characters occur, forced cutting is performed at the maximum character width, so characters may not be cut out correctly. For example, continuous characters such as those shown in FIG. 4(a) are cut as shown in FIG. 4(b): "f" together with half of "i" is treated as one character, and the other half of "i" is treated as another character.

Consequently, characters cut out in this way are rejected or misrecognized.
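The failure mode of forced cutting can be illustrated with a small Python sketch. The function name, the column numbers, and the repeated cutting of the remainder are assumptions made for illustration; the patent only states that the pattern is cut at the maximum character width.

```python
from typing import List, Tuple


def forced_cut(start: int, end: int, max_char_width: int) -> List[Tuple[int, int]]:
    """Split the column range [start, end) into pieces no wider than max_char_width."""
    pieces = []
    x = start
    while x < end:
        pieces.append((x, min(x + max_char_width, end)))
        x += max_char_width
    return pieces


# Hypothetical touching pair "fi": "f" spans columns 0-7 and "i" spans columns 8-13,
# so the true character boundary is at column 8.
pieces = forced_cut(0, 14, max_char_width=10)
print(pieces)  # [(0, 10), (10, 14)]
```

With these assumed numbers the cut falls at column 10 rather than at the true boundary at column 8, so the first piece holds "f" plus part of "i", which is the situation shown in FIG. 4(b).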

The present invention is intended to solve these problems. Its object is to provide an optical character reader with an excellent recognition rate that reduces rejection and misrecognition and recognizes printed characters more accurately even when continuous characters occur.

(Means for Solving the Problem) In order to solve the above problem, the present invention comprises: a reading unit that optically reads an object to be read on a form and binarizes it to obtain an image signal; a line cutting unit that cuts out the image signal for each character line; a projection extraction unit that extracts a projection by projecting the image signal contained in one character line cut out by the line cutting unit in the direction orthogonal to the line; a projection storage unit that temporarily stores the projection extracted by the projection extraction unit; an isolated pattern cutting unit that cuts out, as an isolated pattern, the image signal corresponding only to a black portion of the projection extracted by the projection extraction unit; a pattern width determination unit that compares the width of the isolated pattern cut out by the isolated pattern cutting unit with a predetermined character width, and determines that the isolated pattern corresponds to one character when the width is within the predetermined character width, or to continuous characters formed by a plurality of characters running together when the width is larger than the predetermined character width; a single character dictionary memory in which character patterns of single characters are registered in advance; a continuous character dictionary memory in which character patterns of continuous characters are registered in advance; and a recognition unit that, based on the determination result of the pattern width determination unit, recognizes the object to be read by matching features extracted from the isolated pattern against the character patterns registered in the single character dictionary memory or the continuous character dictionary memory.

(Operation) According to the present invention configured as described above, the reading unit optically reads the object to be read on the form and binarizes it to obtain an image signal. The line cutting unit cuts out the image signal for each character line. The projection extraction unit projects the image signal contained in one cut-out character line in the direction orthogonal to the line, and the extracted projection is temporarily stored in the projection storage unit. The isolated pattern cutting unit then cuts out, as an isolated pattern, the image signal corresponding only to a black portion of the extracted projection. The pattern width determination unit compares the width of the cut-out isolated pattern with a predetermined character width. When the width of the isolated pattern is within the predetermined character width, the isolated pattern is determined to correspond to one character. When the width is larger than the predetermined character width, the isolated pattern is determined to correspond to continuous characters formed by a plurality of characters running together. Based on this determination, the recognition unit matches the features extracted from the isolated pattern against the single-character patterns registered in advance in the single character dictionary memory when the width is within the predetermined character width, or against the continuous-character patterns registered in advance in the continuous character dictionary memory when the width is larger, and thereby recognizes the object to be read on the form.

Therefore, the present invention can solve the above problems and can provide an optical character reader with good working efficiency and an excellent recognition rate.

(Embodiment) Hereinafter, an embodiment of the present invention will be described with reference to the drawings.

FIG. 1 is a block diagram showing an embodiment of the present invention. In the figure, 1 is a reading unit that optically reads the object to be read on a form; 2 is a binarization unit that binarizes the analog signal or 16-gradation digital signal read by the reading unit 1 against a threshold, with black as "1" and white as "0", to obtain an image signal; 3 is a line cutting unit that cuts out the image signal obtained by the binarization unit 2 for each character line; 4 is a projection extraction unit that projects the image signal contained in the character line cut out by the line cutting unit 3 in the direction orthogonal to the line and extracts a projection in which the presence or absence of black is expressed as "1" or "0"; 5 is a projection storage unit that temporarily stores the projection data extracted by the projection extraction unit 4; 6 is an isolated pattern cutting unit that, in the projection extracted by the projection extraction unit 4 and stored in the projection storage unit 5, cuts out as an isolated pattern the image signal corresponding to each run of consecutive "1" (black) values bounded on the left and right by "0" (white); 7 is a pattern width determination unit that detects the width of the isolated pattern cut out by the isolated pattern cutting unit 6, compares it with the maximum character width (the limit up to which a pattern can be regarded as one character), and supplies the comparison result to the dictionary switching unit 13 described later; 8 is a recognition unit that recognizes the isolated pattern received via the pattern width determination unit 7 by matching it against character patterns registered in advance in the dictionary memories described later; 9 is a single character dictionary memory in which character patterns of single characters are registered in advance; 10 is a continuous character dictionary memory in which character patterns of continuous characters, formed by a plurality of characters running together, are registered in advance; 11 is a feature extraction unit that extracts features from the isolated pattern cut out by the isolated pattern cutting unit 6; 12 is a determination unit that determines the character by matching the features extracted by the feature extraction unit 11 against the character patterns registered in the single character dictionary memory 9 or the continuous character dictionary memory 10; and 13 is a dictionary switching unit that, based on the determination result of the pattern width determination unit 7, switches between the single character dictionary memory 9 and the continuous character dictionary memory 10 for matching in the determination unit 12.

Next, the operation of this embodiment will be described with reference to FIG. 1.

First, the reading unit 1 reads the object to be read, such as characters on the form, and outputs the black-and-white density information as an analog signal or a 16-gradation digital signal. The binarization unit 2 binarizes this density information against a threshold and outputs a binary digital signal with black as "1" and white as "0". The line cutting unit 3 cuts this signal out for each character line, and the projection extraction unit 4 projects the image on the character line in the direction orthogonal to the line, extracts the projection, and stores it in the projection storage unit 5. The isolated pattern cutting unit 6 then searches the extracted projection for "0" (white) portions and cuts out, as an isolated pattern, the image on the character line corresponding to each run of consecutive "1" (black) values bounded by "0" (white).
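The binarization step can be sketched in Python as follows. The function name, the sample values, and the convention that a higher gradation level means a darker pixel are assumptions for illustration; the patent specifies only that a 16-gradation signal is binarized against a threshold with black as "1" and white as "0".

```python
from typing import List


def binarize(gray_rows: List[List[int]], threshold: int) -> List[List[int]]:
    """Map 16-gradation levels (0-15) to 1 (black) or 0 (white).

    Assumes that higher levels are darker; a level at or above the
    threshold is treated as black.
    """
    return [[1 if level >= threshold else 0 for level in row] for row in gray_rows]


# Hypothetical 2x4 scan with gradation levels from 0 (white) to 15 (black).
scan = [
    [0, 3, 12, 15],
    [1, 9, 8, 2],
]
print(binarize(scan, threshold=8))  # [[0, 0, 1, 1], [0, 1, 1, 0]]
```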

Some of the individual isolated patterns cut out in this way correspond to one character, while others correspond to two continuous characters, as shown in FIG. 3(a), or to more, as shown in FIG. 3(b). However, an isolated pattern whose width exceeds the maximum character width determined by the font specification can be regarded as a plurality of characters running together. The pattern width determination unit 7 therefore determines the width of each isolated pattern and sends the pattern width information to the recognition unit 8 together with the pattern data. In the recognition unit 8, the feature extraction unit 11 first extracts features from each isolated pattern and sends the data to the determination unit 12. The determination unit 12 determines the character by matching the data against the character patterns registered in the dictionary memories, but the dictionary memory must be switched depending on whether the pattern is a single character or continuous characters. In the example shown in FIG. 1, the dictionary switching unit 13 is provided for this purpose. Based on the determination result obtained by the pattern width determination unit 7, if the width of the isolated pattern is within the maximum character width, the isolated pattern is taken to correspond to one character and is matched against the character patterns registered in the single character dictionary memory 9. If the width exceeds the maximum character width, the isolated pattern is taken to be a plurality of characters running together and is matched against the character patterns registered in the continuous character dictionary memory 10. The determination result of the determination unit 12 is output as the recognition result.
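The width-based dictionary switching can be sketched in Python as follows. The function name, the string labels, and the sample column ranges are assumptions introduced for illustration only.

```python
def select_dictionary(pattern_width: int, max_char_width: int) -> str:
    """Choose the dictionary to match against, based on isolated pattern width.

    A width within the maximum character width selects the single character
    dictionary; a larger width selects the continuous character dictionary.
    """
    return "single" if pattern_width <= max_char_width else "continuous"


# Hypothetical isolated patterns given as (start, end) column ranges, end exclusive.
choices = [
    select_dictionary(end - start, max_char_width=10)
    for start, end in [(1, 9), (12, 30)]
]
print(choices)  # ['single', 'continuous']
```

Unlike forced cutting, the wide pattern is not split; it is passed on whole and only the dictionary used for matching changes.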

The character patterns registered in the continuous character dictionary memory are now described. Continuous characters mostly arise from combinations of a character whose upper part extends to the left, to the right, or to both sides, such as F, f, r, and T, with a lowercase letter, especially i. Because the kinds of combinations can be identified to some extent, it is not necessary to hold every combination in the dictionary. In addition, most continuous characters consist of two characters, few consist of three, and four or more are rare. If two-character and three-character patterns are prepared as the character patterns registered in the dictionary memory, most patterns that exceed the maximum character width can be handled. Therefore, when an isolated pattern exceeding the maximum character width is cut out, it is first matched against the character patterns in the two-character dictionary memory within the continuous character dictionary memory, and if a matching combination exists, it is output as the recognition result. If no combination matches, the pattern is matched against the character patterns in the three-character dictionary memory, and if a matching combination exists, it is output as the recognition result. If there is still no match, the pattern is rejected.
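The two-stage lookup in the continuous character dictionary can be sketched in Python as follows. The feature tuples, the dictionary contents, and the use of exact dictionary lookup as a stand-in for pattern matching are all assumptions for illustration; the patent does not specify the features or the matching method.

```python
from typing import Dict, Optional, Tuple

Feature = Tuple[int, ...]


def match_continuous(
    feature: Feature,
    two_char_dict: Dict[Feature, str],
    three_char_dict: Dict[Feature, str],
) -> Optional[str]:
    """Try the two-character dictionary first, then the three-character one.

    Returns the recognized string, or None to signal a reject.
    """
    for dictionary in (two_char_dict, three_char_dict):
        result = dictionary.get(feature)
        if result is not None:
            return result
    return None


# Hypothetical feature vectors standing in for registered character patterns.
TWO_CHAR = {(1, 0, 1): "fi", (1, 1, 0): "Ti"}
THREE_CHAR = {(1, 1, 1): "ffi"}

print(match_continuous((1, 0, 1), TWO_CHAR, THREE_CHAR))  # fi
print(match_continuous((1, 1, 1), TWO_CHAR, THREE_CHAR))  # ffi
print(match_continuous((0, 0, 0), TWO_CHAR, THREE_CHAR))  # None
```

Checking the two-character dictionary first follows the observation in the text that two-character runs are by far the most common.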

(Effects of the Invention) As described above, according to the present invention, a dictionary memory in which character patterns of continuous characters are registered in advance is provided, so that even when continuous characters occur they are matched and recognized as they are. This makes it possible to provide an optical character reader with fewer rejections and misrecognitions, that is, with good working efficiency and an excellent recognition rate.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing an embodiment of the present invention, FIG. 2 is a block diagram showing a conventional optical character reader, FIG. 3 is a diagram showing the relationship between isolated patterns and projections, and FIG. 4 is a diagram showing an example of forced cutting of continuous characters.
1: reading unit, 2: binarization unit, 3: line cutting unit, 4: projection extraction unit, 5: projection storage unit, 6: isolated pattern cutting unit, 7: pattern width determination unit, 8: recognition unit, 9: single character dictionary memory, 10: continuous character dictionary memory, 11: feature extraction unit, 12: determination unit, 13: dictionary switching unit.

Claims (1)

[Scope of Claims] An optical character reader comprising:
a reading unit that optically reads an object to be read on a form and binarizes it to obtain an image signal;
a line cutting unit that cuts out the image signal for each character line;
a projection extraction unit that extracts a projection by projecting the image signal contained in one character line cut out by the line cutting unit in the direction orthogonal to the line;
a projection storage unit that temporarily stores the projection extracted by the projection extraction unit;
an isolated pattern cutting unit that cuts out, as an isolated pattern, the image signal corresponding only to a black portion of the projection extracted by the projection extraction unit;
a pattern width determination unit that compares the width of the isolated pattern cut out by the isolated pattern cutting unit with a predetermined character width, and determines that the isolated pattern corresponds to one character when the width is within the predetermined character width, or to continuous characters formed by a plurality of characters running together when the width is larger than the predetermined character width;
a single character dictionary memory in which character patterns of single characters are registered in advance;
a continuous character dictionary memory in which character patterns of the continuous characters are registered in advance; and
a recognition unit that, based on the determination result of the pattern width determination unit, recognizes the object to be read by matching features extracted from the isolated pattern against the character patterns registered in the single character dictionary memory or the continuous character dictionary memory.
JP61103863A 1986-05-08 1986-05-08 Optical character reader Granted JPS62262194A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP61103863A JPS62262194A (en) 1986-05-08 1986-05-08 Optical character reader

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP61103863A JPS62262194A (en) 1986-05-08 1986-05-08 Optical character reader

Publications (2)

Publication Number Publication Date
JPS62262194A true JPS62262194A (en) 1987-11-14
JPH0576674B2 JPH0576674B2 (en) 1993-10-25

Family

ID=14365284

Family Applications (1)

Application Number Title Priority Date Filing Date
JP61103863A Granted JPS62262194A (en) 1986-05-08 1986-05-08 Optical character reader

Country Status (1)

Country Link
JP (1) JPS62262194A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0816720A (en) * 1994-06-29 1996-01-19 Nec Corp Character recognition device
US6738519B1 (en) 1999-06-11 2004-05-18 Nec Corporation Character recognition apparatus

Also Published As

Publication number Publication date
JPH0576674B2 (en) 1993-10-25

Similar Documents

Publication Publication Date Title
US4757551A (en) Character recognition method and system capable of recognizing slant characters
JP2553608B2 (en) Optical character reader
JPH0564834B2 (en)
KR100383858B1 (en) Character extracting method and device
US4769851A (en) Apparatus for recognizing characters
JPS58103075A (en) Character reader
JP2011257896A (en) Character recognition method and character recognition apparatus
JPS62262194A (en) Optical character reader
JP2894111B2 (en) Comprehensive judgment method of recognition result in optical type character recognition device
JP3391223B2 (en) Character recognition device
JPH0877293A (en) Character recognition device and generating method for dictionary for character recognition
JPS6160184A (en) Optical character reader
JPH03122786A (en) Optical character reader
JPH1011541A (en) Character recognition device
JPH05135204A (en) Character recognition device
JPH0731711B2 (en) Optical character reader
JPH01234985A (en) Character segmenting device for character reader
JPH1040338A (en) Optical character reader
JPH0969139A (en) Optical character reading method and its device
JPS5914078A (en) Reader of business form
JPS61177581A (en) Optical character reader
JPH1196295A (en) Information processor and its method
JPS5851390A (en) Font character recognizing device
JP2000339404A (en) Device and method for recognizing document
JPS61177582A (en) Optical character reader