JPH02135586A - Optical character reader - Google Patents

Optical character reader

Info

Publication number
JPH02135586A
Authority
JP
Japan
Prior art keywords
character
binarization
threshold
unit
binarizes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP63289668A
Other languages
Japanese (ja)
Inventor
Norio Hamada
濱田 徳郎
Masahiro Kawaguchi
川口 正宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Priority to JP63289668A
Publication of JPH02135586A


Abstract

PURPOSE: To recognize both dark characters and blurred characters by providing a first means that binarizes the signal with a threshold that follows the white level of the background and a second means that binarizes it with a threshold based on the blackest mesh of the surrounding pixels, by adding a comprehensive judgment part that judges the character recognition results corresponding to each binarization, and by processing the several binarization methods in parallel. CONSTITUTION: A preprocessing part 2 consists of a first binarizing means 9, which binarizes the quantized electric signal output from a scanner 1 with a threshold that follows the white level of the paper background, a second binarizing means 10, which binarizes the signal with a threshold calculated from the blackest mesh of the surrounding pixels, and so on. A character recognizing part 3 has a recognition control part 17, which matches the standard character patterns stored in a dictionary memory 16 against the image pattern of each input character from the preprocessing part 2 and recognizes the input character; a comprehensive judgment part 4 is additionally provided at its output stage. The comprehensive judgment part 4 decides among the character recognition results of the individual binarizations by majority decision. Thus, dark characters and blurred characters can both be recognized more accurately.

Description

[Detailed Description of the Invention] [Industrial Application Field] The present invention relates to an optical character reader (OCR) that obtains the image pattern of characters written on a paper surface such as a form by optical scanning and photoelectric conversion, and recognizes the characters from that image pattern.

[Prior Art]

Generally, in this type of optical character reader, determining the binarization threshold is important because the density values of the input image data are quantized into two values, black and white.

Methods for determining this threshold fall broadly into two groups: the fixed threshold method and the floating binarization method.

When the density values of the characters and the background are known and vary little from one input image to the next, the fixed threshold method, in which the threshold is determined empirically in advance, is simpler and more practical.
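
The fixed threshold method described above can be sketched as follows. This is a minimal illustration, not the patent's circuit: the brightness convention (0 = black, 255 = white), the value 128, and the name `binarize_fixed` are all assumptions made for the example.

```python
def binarize_fixed(gray, threshold=128):
    """Binarize with one threshold chosen in advance.

    `gray` is a list of rows of brightness values; the 0 = black,
    255 = white convention and the default threshold are assumptions.
    Returns 1 for black (character) pixels and 0 for white pixels.
    """
    return [[1 if value < threshold else 0 for value in row] for row in gray]


# One scanline: white paper, a dark two-pixel stroke, white paper again.
print(binarize_fixed([[250, 240, 60, 30, 245]]))  # [[0, 0, 1, 1, 0]]
```

Because the threshold never changes, the result is only reliable while the paper and ink levels stay close to the values the threshold was tuned for.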

In contrast, when input conditions such as the paper and the characters vary, the threshold must be determined according to the input image. For this reason, the floating binarization method is used relatively often for tasks such as reading handwritten characters on forms.

There are several ways of setting the level in the floating binarization method. In most cases one of two is used: the white-level-tracking threshold method, which binarizes with a threshold that follows the white level of the form's background, or the black-level-tracking variable threshold method, which binarizes with a threshold that follows the black level of the surrounding mesh.
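
The text does not give formulas for either floating method, so the following is only a sketch under stated assumptions: it works on a single scanline, takes the local maximum as the "white level" and the local minimum as the "blackest mesh", and the parameters `ratio`, `margin`, and `min_contrast` and all function names are hypothetical.

```python
WHITE = 255  # assumed nominal brightness of blank paper (0 = black)


def _window(row, i, radius):
    """Pixels within `radius` of position i, clipped at the scanline ends."""
    return row[max(0, i - radius): i + radius + 1]


def binarize_white_tracking(row, radius=1, ratio=0.7):
    """Threshold follows the brightest (background) level near each pixel."""
    out = []
    for i, value in enumerate(row):
        white = max(_window(row, i, radius))
        out.append(1 if value < ratio * white else 0)
    return out


def binarize_black_tracking(row, radius=1, margin=25, min_contrast=30):
    """Threshold follows the darkest level near each pixel."""
    out = []
    for i, value in enumerate(row):
        black = min(_window(row, i, radius))
        has_ink = WHITE - black >= min_contrast  # skip plain background
        out.append(1 if has_ink and value <= black + margin else 0)
    return out


# Faint stroke at index 2, dark stroke at 4-5, light stain at index 7.
scanline = [250, 248, 190, 250, 30, 35, 250, 210, 250]
print(binarize_white_tracking(scanline))  # [0, 0, 0, 0, 1, 1, 0, 0, 0]
print(binarize_black_tracking(scanline))  # [0, 0, 1, 0, 1, 1, 0, 1, 0]
```

With these illustrative parameters, the white-tracking variant drops the faint stroke while the black-tracking variant keeps it but also marks the stain as black, which mirrors the trade-off between the two methods.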

[Problems to Be Solved by the Invention]

However, in the white-level-tracking threshold method of the conventional example, the threshold is determined with reference to the white level of the form's background, which causes the problem that, for example, dark characters are crushed.

In some cases the faint parts of blurred characters also disappeared. On the other hand, when binarization is performed with the black-level-tracking variable threshold method of the conventional example, the threshold is determined so as to follow the black level of the surrounding mesh, which causes the problem that noise the white-level-tracking threshold method did not pick up, such as stains and creases in the white areas of the form, is picked up.

As a result, in either case, misreadings occur frequently, or the character is rejected as unreadable because no corresponding standard character is found or because two or more standard characters come up as candidates.

[Object of the Invention]

An object of the present invention is to remedy the disadvantages of the conventional example and, in particular, to provide an optical character reader that can recognize both dark characters and blurred characters more accurately.

[Means for Solving the Problems] The present invention comprises a photoelectric conversion unit that photoelectrically converts the reflected light obtained by optically scanning characters on a paper surface, a preprocessing unit that performs binarization and other processing on the output signal of the photoelectric conversion unit, and a character recognition unit that recognizes a character by comparing and judging the input character pattern obtained from the output signal of the preprocessing unit against known standard character patterns. In particular, the preprocessing unit has at least a first binarization means, which binarizes the quantized electrical signal output from the photoelectric conversion unit with a threshold that follows the white level of the paper background, and a second binarization means, which binarizes it with a threshold calculated from the blackest mesh of the surrounding pixels, and the character recognition unit is accompanied by a comprehensive judgment unit that comprehensively judges the character recognition results corresponding to each binarization. This configuration is intended to achieve the object stated above.

[Embodiment of the Invention]

An embodiment of the present invention will be described below with reference to FIG. 1.

The embodiment of FIG. 1 comprises a scanner 1 serving as a photoelectric conversion unit that photoelectrically converts the reflected light obtained by optically scanning characters on a paper surface such as a form, a preprocessing unit 2 that performs binarization and other processing on the output signal of the scanner 1, and a character recognition unit 3 that recognizes a character by comparing (matching) and judging the input character pattern obtained from the output signal of the preprocessing unit 2 against known standard character patterns.

Of these, the scanner 1 consists of an optical system 5 made up of illumination, a lens, a filter, and the like (not shown); a CCD sensor 6 serving as a photoelectric conversion element, which receives the light from the illumination that has been reflected by the paper surface, passed through the filter, and focused by the lens, and photoelectrically converts it; a preamplifier 7 that amplifies the analog signal output from the CCD sensor 6; and an A/D converter 8 that performs analog-to-digital conversion (A/D conversion) on the output signal of the preamplifier 7 to produce a quantized electrical signal. In this embodiment, a two-dimensional CCD sensor, that is, a two-dimensional image sensor, is used as the CCD sensor 6, so that the two-dimensional scanning of the character image is performed by the CCD sensor 6.
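
As a rough illustration of what the A/D converter 8 contributes, the sketch below maps an amplified analog level onto an integer code. The 8-bit resolution, the reference voltage `v_ref`, and the function name `quantize` are assumptions; the text does not state the converter's resolution or input range.

```python
def quantize(voltage, v_ref=1.0, bits=8):
    """Map an analog level in [0, v_ref] to an integer code in [0, 2**bits - 1].

    Inputs outside the range are clipped, as a real converter would saturate.
    The resolution and reference voltage are illustrative assumptions.
    """
    levels = (1 << bits) - 1
    clipped = min(max(voltage, 0.0), v_ref)
    return round(clipped / v_ref * levels)


print(quantize(0.0))  # 0
print(quantize(1.0))  # 255
```

The integer codes produced here play the role of the "quantized electrical signal" that the binarization means receive.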

The preprocessing unit 2 consists of a first binarization means 9 that binarizes the quantized electrical signal output from the scanner 1 with a threshold that follows the white level of the paper background; a second binarization means 10 that binarizes it with a threshold calculated from the blackest mesh of the surrounding pixels; a third binarization means 11 that binarizes it with a threshold determined empirically in advance; first, second, and third image memories 12, 13, and 14, which are provided at the output stages of the three binarization means 9, 10, and 11 respectively and store two-dimensionally the image signals quantized to two values by each binarization means; and a preprocessing means 15 that performs noise removal and deformation correction on the image signals stored in these image memories (hereinafter simply called "image patterns").

The character recognition unit 3 has a dictionary memory 16 in which predetermined standard character patterns are stored in advance, and a recognition control unit 17 that recognizes an input character by matching the standard character patterns stored in the dictionary memory 16 against the image pattern of each input character from the preprocessing unit 2.
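
The matching step is not specified in detail, so the following sketch assumes the simplest possible form: each standard character pattern is a small binary grid, and the input is compared pixel by pixel (Hamming distance). The tiny 3x3 templates, the `max_distance` limit, and the function names are hypothetical.

```python
def hamming(a, b):
    """Number of pixels that differ between two equally sized binary patterns."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize(pattern, dictionary, max_distance=2):
    """Return the label of the closest standard pattern, or None to reject.

    Rejects when nothing is close enough or when two candidates tie,
    which corresponds to the two rejection cases named in the text.
    """
    ranked = sorted((hamming(pattern, tmpl), label)
                    for label, tmpl in dictionary.items())
    best_distance, best_label = ranked[0]
    if best_distance > max_distance:
        return None  # no corresponding standard character
    if len(ranked) > 1 and ranked[1][0] == best_distance:
        return None  # two or more equally good candidates
    return best_label


DICTIONARY = {  # stand-in for the dictionary memory
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}

noisy_i = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]  # "I" with one stray pixel
print(recognize(noisy_i, DICTIONARY))  # I
```

A production matcher would normalize size and position first; the point here is only the compare, rank, and reject flow.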

Furthermore, in this embodiment, a comprehensive judgment unit 4, which comprehensively judges the character recognition results corresponding to each binarization, is provided at the output stage of the recognition control unit 17.

Next, the overall operation of the above embodiment will be described.

The image signal captured and photoelectrically converted by the CCD sensor 6 inside the scanner 1 is amplified by the preamplifier 7 and A/D-converted by the A/D converter 8 into a quantized electrical signal. The quantized electrical signal (digital value) output from the A/D converter 8 is input to the first to third binarization means 9 to 11, which binarize it in parallel. That is, the first binarization means 9 binarizes by the white-level-tracking threshold method, the second binarization means 10 by the black-level-tracking variable threshold method, and the third binarization means 11 by the fixed threshold method. The image signals thus quantized to two values are stored two-dimensionally in the first, second, and third image memories 12, 13, and 14, and before recognition each of these image patterns undergoes preprocessing such as noise removal and deformation correction by the preprocessing means 15.
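
The text names noise removal as one of the preprocessing steps but does not say how it is done. One simple possibility, shown here purely as an assumption, is to delete black pixels that have no black neighbour among the eight surrounding pixels; the function name is hypothetical.

```python
def remove_isolated_pixels(pattern):
    """Clear black pixels (1) that have no black 8-neighbour.

    `pattern` is a binary image pattern as a list of rows. A new pattern is
    returned; the input is not modified.
    """
    height, width = len(pattern), len(pattern[0])
    out = [row[:] for row in pattern]
    for y in range(height):
        for x in range(width):
            if pattern[y][x] != 1:
                continue
            neighbours = sum(
                pattern[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbours == 0:
                out[y][x] = 0
    return out


speckled = [
    [1, 0, 0, 0],  # lone speck at the top-left corner
    [0, 0, 1, 1],
    [0, 0, 1, 0],
]
print(remove_isolated_pixels(speckled))
# [[0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 0]]
```

A step like this would be applied to each of the three stored image patterns separately, since each binarization method leaves a different amount of noise.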

The recognition control unit 17 then recognizes the input character by matching the standard character patterns stored in advance in the dictionary memory 16 against each image pattern from the preprocessing unit 2, and outputs the results to the comprehensive judgment unit 4 at the next stage. The comprehensive judgment unit 4 finally decides among the character recognition results of the three kinds of binarization by majority vote.
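
The majority decision can be sketched as below. The text only says that the three results are decided by majority vote; treating a rejected reading as `None` and rejecting when no label reaches a strict majority are assumptions, as is the function name.

```python
from collections import Counter


def majority_decision(results):
    """Pick the label returned by more than half of the recognizers.

    `results` holds one entry per binarization method; None marks a reading
    that the recognizer rejected. Returns None when no label has a strict
    majority (an assumed policy, not stated in the text).
    """
    votes = Counter(r for r in results if r is not None)
    if not votes:
        return None
    label, count = votes.most_common(1)[0]
    return label if count * 2 > len(results) else None


# white-tracking, black-tracking, and fixed-threshold readings of one character
print(majority_decision(["8", "8", "3"]))  # 8
print(majority_decision(["8", "3", "B"]))  # None
```

With three voters, a single method that crushes a dark character or loses a faint one is outvoted whenever the other two agree.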

[Effects of the Invention]

As described above, according to the present invention, the preprocessing unit has a first binarization means that binarizes the quantized electrical signal output from the photoelectric conversion unit with a threshold that follows the white level of the paper background and a second binarization means that binarizes it with a threshold calculated from the blackest mesh of the surrounding pixels, and the character recognition unit is accompanied by a comprehensive judgment unit that comprehensively judges the character recognition results corresponding to each binarization. Several binarization methods can therefore be processed in parallel, which compensates for the weaknesses of using a single binarization method and makes it possible to recognize dark characters without crushing them and faint, blurred characters without losing them.

Accordingly, both dark characters and blurred characters can be recognized more accurately. This raises the recognition rate for the corresponding standard characters and effectively reduces the misreadings and rejections as unreadable that were problems in the past, so that an optical character reader superior to conventional ones can be provided.

[Brief Description of the Drawing]

FIG. 1 is a block diagram showing an embodiment of the present invention. 1: scanner serving as the photoelectric conversion unit, 2: preprocessing unit, 3: character recognition unit, 4: comprehensive judgment unit, 9: first binarization means, 10: second binarization means.

Claims (1)

[Claims] (1) An optical character reader comprising a photoelectric conversion unit that photoelectrically converts the reflected light obtained by optically scanning characters on a paper surface, a preprocessing unit that performs binarization and other processing on the output signal of the photoelectric conversion unit, and a character recognition unit that recognizes a character by comparing and judging the input character pattern obtained from the output signal of the preprocessing unit against known standard character patterns, characterized in that the preprocessing unit has at least a first binarization means that binarizes the quantized electrical signal output from the photoelectric conversion unit with a threshold that follows the white level of the paper background and a second binarization means that binarizes it with a threshold calculated from the blackest mesh of the surrounding pixels, and in that the character recognition unit is accompanied by a comprehensive judgment unit that comprehensively judges the character recognition results corresponding to each binarization.
JP63289668A 1988-11-16 1988-11-16 Optical character reader Pending JPH02135586A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP63289668A JPH02135586A (en) 1988-11-16 1988-11-16 Optical character reader

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP63289668A JPH02135586A (en) 1988-11-16 1988-11-16 Optical character reader

Publications (1)

Publication Number Publication Date
JPH02135586A (en) 1990-05-24

Family

ID=17746205

Family Applications (1)

Application Number Title Priority Date Filing Date
JP63289668A Pending JPH02135586A (en) 1988-11-16 1988-11-16 Optical character reader

Country Status (1)

Country Link
JP (1) JPH02135586A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1324521C (en) * 2003-03-15 2007-07-04 三星电子株式会社 Preprocessing equipment and method for distinguishing image character

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5759286A (en) * 1980-09-27 1982-04-09 Agency Of Ind Science & Technol Character reader
JPS57192171A (en) * 1981-05-21 1982-11-26 Ricoh Co Ltd Picture processing device
JPS58219682A (en) * 1982-06-14 1983-12-21 Fujitsu Ltd Read system of character picture information
JPS6043556A (en) * 1983-08-17 1985-03-08 株式会社クボタ Falling ridge roofing method
JPS61255486A (en) * 1985-05-09 1986-11-13 Nec Corp Graphic processing unit
JPS63111591A (en) * 1986-10-29 1988-05-16 Sumitomo Electric Ind Ltd Optical character reader
JPS63177284A (en) * 1987-01-19 1988-07-21 Sumitomo Electric Ind Ltd Optical character reader

Similar Documents

Publication Publication Date Title
JP2553608B2 (en) Optical character reader
US4355301A (en) Optical character reading system
EP0505729B1 (en) Image binarization system
US9262665B2 (en) Decoding method and decoding processing device
EP0144006B1 (en) An improved method of character recognitionand apparatus therefor
JP3906221B2 (en) Image processing method and image processing apparatus
JPH02135586A (en) Optical character reader
KR20000025647A (en) Method for processing image using shading algorithm
JPH10222602A (en) Optical character reading device
EP0504576A2 (en) Document scanner
JPH0131236B2 (en)
JP2894111B2 (en) Comprehensive judgment method of recognition result in optical type character recognition device
JP2590099B2 (en) Character reading method
JP3095437B2 (en) Character line detection cutout device and character reading device
JPH05298482A (en) Character reader
JPS5911153B2 (en) Optical character reading method
JPS62297981A (en) Binarization system for image
JPH04167084A (en) Character reader
JPH04274583A (en) Character reader
JPH07104907B2 (en) Binarization circuit
JPS6160475B2 (en)
JPH04260181A (en) Character reader
JPS5990175A (en) Binary coding circuit
JPS6180373A (en) Recording method of character
JPS6024993B2 (en) Image binarization method