JPS6073793A - Optical character reader - Google Patents

Optical character reader

Info

Publication number
JPS6073793A
JPS6073793A JP58181120A JP18112083A JPS6073793A JP S6073793 A JPS6073793 A JP S6073793A JP 58181120 A JP58181120 A JP 58181120A JP 18112083 A JP18112083 A JP 18112083A JP S6073793 A JPS6073793 A JP S6073793A
Authority
JP
Japan
Prior art keywords
data
inversion
image data
treatment
inverted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP58181120A
Other languages
Japanese (ja)
Inventor
Masahiro Kojima
雅広 小島
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd
Priority to JP58181120A
Publication of JPS6073793A
Legal status: Pending

Landscapes

  • Character Input (AREA)

Abstract

PURPOSE: To enable the reader to recognize characters with the same preprocessing means and the same dictionary even when inverted characters are mixed in, by inverting each inverted character when it is detected and returning it to the preprocessing section. CONSTITUTION: The pre-feature monitoring section 4, which monitors image data as it is cut out from the work buffer 6, makes simple category-level feature judgments, such as that the outer contour has no dents or breaks, and thereby detects inverted image data. Data judged to be inverted is black/white inverted in the inversion processing section 9 and returned to the preprocessing section 5. Because the data is already binarized, only dust removal and smoothing are performed there, and the data is then returned to the buffer 6. From that point the monitoring section 4 acts only as ordinary cut-out control, recognition and editing proceed along the ordinary route, and the result is output to the data memory 11.

Description

DETAILED DESCRIPTION OF THE INVENTION
<Field> The present invention relates to an optical character reader, and more particularly to a reader having a reflex processing loop for recognizing inverted characters in addition to normal characters.

<Prior art and background> So-called optical character readers (commonly called OCR devices) are used for machine recognition of characters and figures. The objects they normally recognize are characters and figures printed or handwritten on a medium with the body of each character in black. When the reader is viewed as a data input means for an information processing system, however, there is a demand to use special representations on the form that serves as the medium, for headings or emphasis: for example, white-on-black characters with black and white reversed, or double-struck characters printed twice with a slight offset.

When the means for writing data onto the medium is a so-called matrix printer, the normal pattern and the white-on-black pattern can both be generated from the same pattern memory contents in the pattern generator, so white-on-black characters are used relatively often.

From the standpoint of the OCR device, on the other hand, such characters occur infrequently, yet handling them with a dedicated dictionary would require at least twice the dictionary capacity and would lengthen the matching time for recognition. In addition, when the read image information is inverted (white-on-black), voids (white noise) tend to appear in the solid black background, and the preprocessing that fills them in is the reverse of the preprocessing applied to normal pattern information.
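The point about reversed preprocessing can be illustrated with a small sketch. It is not taken from the patent: the pixel convention (1 = black, 0 = white), the 8-neighbor isolation rule, and the names `invert` and `remove_isolated` are all assumptions chosen for illustration. It shows that filling a white void in an inverted image gives the same result as inverting the image, removing black dust with the ordinary routine, and inverting back.

```python
from typing import List

Image = List[List[int]]  # assumed convention: 1 = black, 0 = white


def invert(img: Image) -> Image:
    """Swap black and white in a binary image."""
    return [[1 - v for v in row] for row in img]


def remove_isolated(img: Image, target: int) -> Image:
    """Flip every pixel of value `target` whose neighbors all differ from it.

    target=1 removes isolated black dust; target=0 fills isolated white voids.
    """
    h, w = len(img), len(img[0])

    def isolated(y: int, x: int) -> bool:
        return all(
            img[ny][nx] != target
            for ny in range(max(0, y - 1), min(h, y + 2))
            for nx in range(max(0, x - 1), min(w, x + 2))
            if (ny, nx) != (y, x)
        )

    return [
        [1 - v if v == target and isolated(y, x) else v for x, v in enumerate(row)]
        for y, row in enumerate(img)
    ]


# A single white void inside a solid black background.
inverted_scan = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]

# Dedicated "inverted" preprocessing: fill white voids directly.
filled_directly = remove_isolated(inverted_scan, target=0)

# Same result using only the ordinary dust-removal routine plus inversion.
filled_via_inversion = invert(remove_isolated(invert(inverted_scan), target=1))
```

Under these assumptions the two results coincide, which is why a single preprocessing routine can suffice once the data has been inverted.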

<Purpose and features> In view of the above, an object of the present invention is to recognize inverted characters with the same preprocessing means and the same dictionary, by inverting the pattern data at the point an inverted character is caught and returning it to the preprocessing section. To achieve this object, the invention is characterized in that an optical character reader having a preprocessing section for recognition processing of read image information, a cut-out section, a recognition processing section, and a main control section that processes the data and controls these means is provided with means for detecting, at cut-out time, that the information is inverted, and with a working section that inverts the cut-out information; when the information is inverted, the cut-out information is inverted and returned to the preprocessing section.

<Embodiment> FIG. 1 is an explanatory diagram of one embodiment of the present invention. In the figure, 1 is a form; 2 is a light source; 3 is an optical sensor section that reads an information image (raw read data) from the form; 4 is a pre-feature monitoring section that normally performs cut-out and also detects inverted information; 5 is a preprocessing section that temporarily stores the read information image as image data, binarizes it, applies preprocessing for recognition such as so-called dust removal and smoothing, and supplies the processed image data to the work buffer 6; 7 is a dictionary for recognition; 8 is a recognition working section; and 9 is an inversion processing section that converts the image data in the work buffer 6 into inverted data.
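The three stages of the preprocessing section 5 (binarization, dust removal, smoothing) can be sketched as follows. The patent does not specify the algorithms, so everything here is an assumption for illustration: the grayscale threshold of 128, the rule that dust is a black pixel with no black neighbor, the rule that smoothing fills a white pixel with at least 7 black neighbors, and all function names.

```python
from typing import List

Image = List[List[int]]  # after binarization: 1 = black, 0 = white


def binarize(gray: List[List[int]], threshold: int = 128) -> Image:
    """Binarization: gray levels below the (assumed) threshold become black."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def count_black_neighbors(img: Image, y: int, x: int) -> int:
    """Count black pixels among the up-to-8 neighbors of (y, x)."""
    h, w = len(img), len(img[0])
    total = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if (dy, dx) != (0, 0) and 0 <= y + dy < h and 0 <= x + dx < w:
                total += img[y + dy][x + dx]
    return total


def remove_dust(img: Image) -> Image:
    """Dust removal: erase black pixels that have no black neighbor."""
    return [
        [0 if v == 1 and count_black_neighbors(img, y, x) == 0 else v
         for x, v in enumerate(row)]
        for y, row in enumerate(img)
    ]


def smooth(img: Image, min_neighbors: int = 7) -> Image:
    """Smoothing: fill white pixels that are almost surrounded by black."""
    return [
        [1 if v == 0 and count_black_neighbors(img, y, x) >= min_neighbors else v
         for x, v in enumerate(row)]
        for y, row in enumerate(img)
    ]


def preprocess(gray: List[List[int]]) -> Image:
    """Full pass for freshly read data: binarize, remove dust, smooth."""
    return smooth(remove_dust(binarize(gray)))


def preprocess_binary(img: Image) -> Image:
    """Second pass for data that is already binary: dust removal and smoothing only."""
    return smooth(remove_dust(img))
```

Splitting the pipeline into `preprocess` and `preprocess_binary` mirrors the description's remark that data returned from the inversion processing section is already binarized and needs only the last two stages.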

10 is a control memory that stores format information of the form 1 as control information; 11 is a data memory for storing recognized and edited recognition data; and 12 is a main control section that edits the recognized data and controls these means. Thick lines indicate the flow of data, and thin lines indicate control information.

FIGS. 2, 3, and 4 are auxiliary diagrams for FIG. 1. FIG. 2 explains the working configuration of the preprocessing section 5, and FIG. 3 explains the processing flow. FIG. 4 is a flowchart of an example of the pre-feature determination performed at cut-out time in the pre-feature monitoring section 4.

As the figures show, normal image data is processed by the preprocessing section 5 and stored in the work buffer 6, cut out one character at a time by the monitoring section 4, which also serves as cut-out control, and matched against the contents of the dictionary 7 in the recognition working section 8 to perform character recognition. The recognized result is edited by the main control section 12 and stored in the data memory 11. The format information in the control memory 10 is used to control the cut-out of data before recognition and to edit the recognized data into the output format. The configuration and flow up to this point are the same as in conventional devices. This embodiment, however, has the pre-feature monitoring section 4, which monitors the image data as it is cut out from the work buffer 6. By making very simple category-level feature judgments, for example that the outer contour of the cut-out image data is larger than a predetermined value, that the ratio of black is larger than a predetermined value, and that the outer contour has no dents or breaks, it detects that the data is inverted image data rather than normal image data. Data judged to be inverted is black/white inverted by the inversion processing section 9 and returned to the preprocessing section 5; because binarization has already been done, only dust removal and smoothing are performed, and the data is returned to the work buffer 6. Since the image data has by then been inverted, the pre-feature monitoring section 4 acts only as ordinary cut-out control, recognition and editing proceed along the normal route, and the result is output to the data memory 11.
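The detection and return loop described above can be sketched roughly as follows. This is a minimal illustration, not the patented circuit: the thresholds `min_size` and `min_black_ratio`, the reading of "no dents or breaks" as "every border pixel of the cut-out cell is black", and the names `looks_inverted`, `invert`, and `normalize` are all assumptions.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]  # assumed convention: 1 = black, 0 = white


def looks_inverted(cell: Image, min_size: int = 3,
                   min_black_ratio: float = 0.5) -> bool:
    """Cheap pre-feature check on one cut-out character cell."""
    h, w = len(cell), len(cell[0])
    # Feature 1: the outer extent is at least a given size.
    if h < min_size or w < min_size:
        return False
    # Feature 2: the proportion of black pixels exceeds a given value.
    black = sum(sum(row) for row in cell)
    if black / (h * w) <= min_black_ratio:
        return False
    # Feature 3: the outer contour has no dents or breaks,
    # approximated here as an entirely black border.
    border = cell[0] + cell[-1] + [row[0] for row in cell] + [row[-1] for row in cell]
    return all(border)


def invert(cell: Image) -> Image:
    """Black/white inversion, as done by the inversion processing section."""
    return [[1 - v for v in row] for row in cell]


def normalize(cell: Image,
              preprocess_binary: Callable[[Image], Image] = lambda c: c
              ) -> Tuple[Image, bool]:
    """Reflex loop: invert a detected cell and send it back through preprocessing.

    Returns the cell to hand to recognition and whether the loop was taken.
    """
    if looks_inverted(cell):
        return preprocess_binary(invert(cell)), True
    return cell, False
```

With a 5x5 white-on-black cell containing a three-pixel vertical stroke, `normalize` returns the stroke as a black-on-white cell; a second call on that result takes the normal route because the black ratio is now low, which matches the statement that the monitoring section then acts only as cut-out control.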

With this configuration, the same preprocessing section and the same dictionary as before, making the same category determinations, can recognize the data regardless of representation even when normal-format and inverted-format image data are mixed.

Processing that goes through the return loop naturally takes extra time, even though the determination itself is simple. In normal system operation, however, characters printed for emphasis appear far less often than characters in the normal representation, so the reflex route does not greatly reduce overall system throughput. The advantage of a simple hardware configuration and a low implementation cost deserves more attention.
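The throughput argument can be made concrete with a simple expected-value estimate. The model and the numbers are hypothetical and do not come from the patent: it assumes every character costs `t_normal` on the normal route and that an inverted character costs an additional `t_loop_extra` for detection, inversion, and the second preprocessing pass.

```python
def mean_time_per_char(t_normal: float, t_loop_extra: float,
                       p_inverted: float) -> float:
    """Expected per-character time when a fraction p_inverted takes the loop."""
    if not 0.0 <= p_inverted <= 1.0:
        raise ValueError("p_inverted must be between 0 and 1")
    return t_normal + p_inverted * t_loop_extra


def relative_overhead(t_normal: float, t_loop_extra: float,
                      p_inverted: float) -> float:
    """Fractional slowdown relative to a form with no inverted characters."""
    return mean_time_per_char(t_normal, t_loop_extra, p_inverted) / t_normal - 1.0


# Hypothetical figures: 10 ms per normal character, 6 ms extra for the loop,
# and 5% of characters printed white-on-black.
overhead = relative_overhead(10.0, 6.0, 0.05)  # roughly 0.03, i.e. about 3%
```

Under these assumed figures the slowdown is about 3%, small compared with the cost of matching every character against a dictionary of twice the size.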

The data processing and procedure control functions described above can be replaced by an internal program of a microprocessor.

As a variation of this embodiment, it is also possible to have the pre-monitoring section perform the route processing [...]
<Effects> As described above, the present invention has the characteristic effect of providing a character recognition device that can recognize normal characters and inverted characters in common, without any particular increase in dictionary capacity and without much increase in hardware or microprogram.

[Brief explanation of drawings]

FIG. 1 is an explanatory diagram of one embodiment of the present invention, and FIGS. 2, 3, and 4 are supplementary explanatory diagrams for FIG. 1. In the figures, 4 is a pre-feature monitoring section, 5 is a preprocessing section, 6 is a work buffer, 7 is a dictionary, 8 is a recognition working section, 9 is an inversion processing section, 10 is a control memory, 11 is a data memory, 12 is a main control section, 5a is a binarization section, 5b is a dust removal section, and 5c is a smoothing section.

Claims (1)

[Claims] An optical character reader having a preprocessing section for recognition processing of read image information, a cut-out section, a recognition processing section, and a main control section that performs data processing and procedure control for them, characterized by having means for detecting, at cut-out time, that the information is inverted and a working section that inverts the cut-out information, and by controlling so that, when the information is inverted, the cut-out information is inverted and returned to the preprocessing section.
JP58181120A 1983-09-29 1983-09-29 Optical character reader Pending JPS6073793A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP58181120A JPS6073793A (en) 1983-09-29 1983-09-29 Optical character reader

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP58181120A JPS6073793A (en) 1983-09-29 1983-09-29 Optical character reader

Publications (1)

Publication Number Publication Date
JPS6073793A true JPS6073793A (en) 1985-04-25

Family

ID=16095198

Family Applications (1)

Application Number Title Priority Date Filing Date
JP58181120A Pending JPS6073793A (en) 1983-09-29 1983-09-29 Optical character reader

Country Status (1)

Country Link
JP (1) JPS6073793A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0524797A2 (en) * 1991-07-23 1993-01-27 Canon Kabushiki Kaisha Image processing method and apparatus
WO2004097721A1 (en) * 2003-04-25 2004-11-11 Sharp Kabushiki Kaisha Image processing device, image processing method, image processing program, and computer-readable recording medium containing the program

Similar Documents

Publication Publication Date Title
JP2940936B2 (en) Tablespace identification method
US6411733B1 (en) Method and apparatus for separating document image object types
JP2812982B2 (en) Table recognition method
JPS6073793A (en) Optical character reader
JP2633235B2 (en) OCR facsimile machine
JPH0290383A (en) Image scanner device
JPS6046471B2 (en) character reading device
JPS5949671A (en) Optical character reader
JP2894111B2 (en) Comprehensive judgment method of recognition result in optical type character recognition device
JPH04105178A (en) Document picture processor
JPH10171924A (en) Character recognizing device
JP4129902B2 (en) Ruled line erasing method, ruled line erasing apparatus, and recording medium
JPH05174178A (en) Character recognizing method
JP2001209755A (en) Device and method for correcting miswriting and computer readable recording medium with miswriting correction program stored therein
JPH04213179A (en) Character reader
JPH03217993A (en) Character size recognizer
JPH03149648A (en) Document processor
JPH0581469A (en) Pen type optical character recognition device
JPH06333089A (en) Optical character reader
JPH05189599A (en) Optical character reader
JPH01233586A (en) Printed character recognizing and editing system
JPH05189604A (en) Optical character reader
JPH03669B2 (en)
JPH04329492A (en) Character segmenting method
JPH06215187A (en) Optical character reader