JPS6027436B2 - Character recognition correction method - Google Patents

Character recognition correction method

Info

Publication number
JPS6027436B2
Authority
JP
Japan
Prior art keywords
character
recognition
line
memory
line mark
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
JP54065747A
Other languages
Japanese (ja)
Other versions
JPS55157074A (en)
Inventor
修司 汐崎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd
Priority to JP54065747A
Publication of JPS55157074A
Publication of JPS6027436B2

Description

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a character recognition correction method in which, when the characters of a skewed character line on a form are recognized in memory, the center position of each character is corrected before it is read.

Conventionally, a widely used character recognition method has the scanner of an optical character reader (hereinafter OCR) detect the character pattern for one line of a form as video data and store it in memory, and then segments and recognizes the characters from the pattern held in that memory.

In this case, as shown in Fig. 1, form 2 is fed in direction 5 relative to scanning surface 1 of the OCR device's optical scanner and is set in place when a sensor detects line mark 3. The form may be set while still skewed, as shown in the figure. Character line 4 to be detected on form 2 is written so that its center line lies on the line PP′ passing through the corresponding line marks 3₁ and 3₂ on the two sides, that is, perpendicular to both side edges of the form. The line therefore makes an angle θ with scanning direction PQ, which is orthogonal to feed direction 5 of form 2. The video data scanned in this relationship is stored in memory, and character segmentation and recognition are performed on the stored character pattern PP′ along scanning direction PQ, in the same relationship as in Fig. 1.

Normally, form 2 is positioned for scanning using the line mark on one side only (3₁). If the stop-position error in this detection is within a certain range, a rescan is performed when recognition fails, and the positional shift is corrected during that recognition read. However, no correction is made for the angular deviation θ caused by the skew of form 2. With this kind of deviation, many rejected characters occur, particularly among the character patterns nearer the left edge.

A close look at the rejected character patterns is shown in Fig. 2. If dotted frame 8 shows the character pattern, relative to scanning direction PQ during recognition in memory, for a form 2 that was set correctly, then for a skewed form 2 the character center has a vertical shift corresponding to angle θ together with a slight rotational shift.

Therefore, once the vertical shift of the character center is corrected, only a slight rotational shift remains, as shown by solid frame 7 superimposed in the figure. Character 6 naturally shows a slight rotational shift as well, but it remains fully recognizable as long as angle θ is not too large. A correction of this kind should rescue many of the characters that are rejected because of this type of deviation. The object of the present invention is to provide a character recognition correction method that can effectively rescue, during character recognition, the rejects caused by skew of the character line on a form.
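The geometry behind Figs. 1 and 2 can be written out as a short worked derivation. This is a sketch under assumptions that the source does not state explicitly: the skew angle is written θ, x is the distance along scanning direction PQ measured from the first line mark, L is the distance between the two line marks, w is the width of one character, and the skew is treated as a rigid tilt of the whole line.

```latex
\begin{align*}
y(x) &= x \tan\theta
  && \text{vertical offset of the line center at distance } x \\
d &= L \tan\theta
  && \text{offset between the two line marks, a distance } L \text{ apart} \\
r(u) &= (x_c + u)\tan\theta - x_c \tan\theta = u \tan\theta,
  \quad |u| \le \tfrac{w}{2}
  && \text{residual after shifting the center of the character at } x_c
\end{align*}
```

Under these assumptions the uncorrected offset grows linearly with x, which is consistent with the rejects clustering at the far end of the line, while the residual after correction is bounded by (w/2) tan θ and is small because w is much smaller than L.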

To achieve this object, the character recognition correction method of the present invention applies to a character recognition system in which the character pattern for one line of a form, detected by the scanner of an optical character reader, is stored in a memory and the characters are recognized and read in that memory. It comprises: means for detecting a one-end line mark provided at one end of the form; means for detecting an other-end line mark provided at the other end of the form; comparison detection means for detecting the positional deviation between the one-end line mark and the other-end line mark; division means for dividing the positional deviation detected by the comparison detection means by a predetermined number of characters per line, to obtain the per-character deviation due to skew; and arithmetic means for calculating, for each character, a center position correction amount for the character being recognized. The arithmetic means operates during character recognition processing performed after the one-line character pattern has been stored in the memory and the other-end line mark has been detected, and it bases the calculation on the one-end line mark position information, the per-character deviation information, and information indicating the ordinal position of the character being recognized. When the character string is skewed, the center position correction amount is applied character by character to the character information stored in the memory, and recognition reading is performed while the center of each character is successively corrected by the deviation for that character.

The present invention is described in detail below with reference to an embodiment.

Fig. 3 is an explanatory diagram of the configuration and function of the main part of the present invention.

A feature of the present invention is that the reject problem caused by form skew is solved purely by correcting recognition in memory, leaving the form and the scanner as they are. Memory 11 of the recognition device is shown as the main part. In the figure, scanner 10 of the OCR device stores the character pattern for one line of a form in memory 11 as video data.

Suppose now that character patterns 12₁ to 12ₙ are laid out as shown in the figure because the form is skewed. That is, the form is set at mark 13₁, which corresponds to line mark 3₁ at the right end of form 2 in Fig. 1. Scanning proceeds from this position, the video data of character patterns 12ₙ to 12₁ are stored at angle θ, and mark 13₂, which corresponds to line mark 3₂ at the left end, is detected. Let the positions of marks 13₁ and 13₂ be a (bits) and b (bits) respectively. The bit difference d corresponding to the skew is divided by the number of characters n to obtain the per-character deviation Δ due to skew:

$$\Delta = \frac{b - a}{n} = \frac{d}{n} \qquad (1)$$

The character patterns are then recognized and read while the character center is moved up or down by Δ for each successive character.
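The per-character shift Δ = (b − a)/n described above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function names `per_character_shift` and `corrected_centers` are hypothetical, positions are treated as plain numbers in bit units, and the first character is assumed to sit at position a with zero offset.

```python
def per_character_shift(a: int, b: int, n: int) -> float:
    """Return the skew-induced shift per character, (b - a) / n.

    a, b: positions (in bits) of the two detected line marks.
    n:    predetermined number of characters in the line.
    """
    if n <= 0:
        raise ValueError("n must be a positive character count")
    return (b - a) / n


def corrected_centers(a: int, b: int, n: int) -> list[float]:
    """Return the corrected center position for each of the n characters.

    Character k (counting from 0 at the mark with position a) is given
    the center a + k * delta, so the sign of (b - a) decides whether the
    centers move upward or downward along the line.
    """
    delta = per_character_shift(a, b, n)
    return [a + k * delta for k in range(n)]
```

For example, with a = 40, b = 48 and n = 4, the shift per character is 2 bits and the centers are 40, 42, 44 and 46. Swapping a and b gives the mirrored, downward-stepping sequence.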

In this way, as explained with reference to Fig. 2, the character patterns being recognized differ from the normal case only by a slight rotational shift, so recognition is possible as long as angle θ is not too large.

Fig. 4 is an explanatory diagram of an embodiment of the present invention, equipped with a recognition device that includes memory 11 of Fig. 3.

In the figure, line mark sensor 21 detects the position information (address) of point a of the line mark on the right side of the form and stores it in register 22.

This point is the center position of the first character pattern in the scan. The position information of point a is input to comparator 24 and is also branched off to adder 31, described later. As shown in Fig. 3, scanner 10 stores the video data of the character pattern for one line of the form in memory 11 of recognition device 29. When scanning reaches point b of the line mark at the left end of the form, line mark sensor 23 detects the position information (address) of point b and feeds it to comparator 24, which yields the term (b − a) of equation (1). This output (b − a) is fed to divider 27, which divides it by the number of characters n, taken from format register 25 and held in memory 26, to obtain the center deviation per character. Meanwhile, at the moment line mark sensor 23 detects point b, a start signal is sent to recognition device 29. Character re-recognition then begins on memory 11 one character at a time, with the center position moved for each character.

For this purpose a character counter 30 is provided, which counts up each time recognition of one character finishes. Multiplier 28 multiplies this count by the output of divider 27 (Δ = d/n), which gives the deviation of each character pattern's center from point a.

Accordingly, the output of multiplier 28 is input to adder 31 and added to the position information of point a, which gives the corrected center position of each character. If this value is held in register 32 and supplied to recognition device 29 as the center for recognizing each character, almost all characters become recognizable, with only a slight rotational shift remaining.

As explained above, according to the present invention, the center position of each character is corrected when the characters of a skewed character line on a form are recognized in memory. Provided the skew is not too large, almost all characters then become recognizable, and most of the many rejected characters that this kind of deviation used to produce can be rescued.

The restriction to moderate skew is acceptable because a form whose skew is too large is not regarded as a normal paper feed and is removed separately. The method thus contributes greatly to reducing rejected characters in character recognition and can raise the efficiency of character recognition.
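The data path of Fig. 4 described above (comparator 24, divider 27, character counter 30, multiplier 28, adder 31, register 32) can be modeled as a small stateful object. This is a hedged software stand-in for the hardware blocks, not the circuit itself: the class name `CenterCorrector` and its method `next_center` are hypothetical, and floating-point arithmetic is an assumption, since the source does not say how fractional bit positions are handled.

```python
from dataclasses import dataclass


@dataclass
class CenterCorrector:
    """Illustrative model of the Fig. 4 data path (names are hypothetical)."""

    a: int          # register 22: address of point a (first line mark)
    b: int          # address of point b from line mark sensor 23
    n: int          # characters per line (format register 25 / memory 26)
    count: int = 0  # character counter 30, starts at zero

    def next_center(self) -> float:
        """Return the corrected center for the next character to recognize."""
        d = self.b - self.a          # comparator 24: positional difference
        delta = d / self.n           # divider 27: deviation per character
        offset = self.count * delta  # multiplier 28: deviation from point a
        center = self.a + offset     # adder 31: value held in register 32
        self.count += 1              # counter advances once the character is done
        return center
```

Calling `next_center()` once per character reproduces the sequence a, a + Δ, a + 2Δ, and so on. The first character receives no offset, which matches the statement that point a is the center of the first character pattern in the scan.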

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is an explanatory diagram of the problem with form scanning in an OCR device. Fig. 2 is a schematic explanatory diagram of the principle of the present invention. Fig. 3 is an explanatory diagram of the configuration and function of the main part of the present invention. Fig. 4 is an explanatory diagram showing the configuration of an embodiment of the present invention.

In the figures, 2 is the form; 3₁ and 3₂ are line marks; 4 is the character line; 10 is the scanner; 11 is the memory; 12₁ to 12ₙ are character patterns; 13₁ and 13₂ are detection marks; 21 and 23 are line mark sensors; 22 and 32 are registers; 24 is a comparator; 25 is a format register; 26 is a memory; 27 is a divider; 28 is a multiplier; 29 is the recognition device; 30 is the character counter; and 31 is an adder.

Claims (1)

1. A character recognition correction method for a character recognition system in which the character pattern for one line of a form, detected by the scanner of an optical character reader, is stored in a memory and the characters are recognized and read in that memory, the method comprising: means for detecting a one-end line mark provided at one end of the form; means for detecting an other-end line mark provided at the other end of the form; comparison detection means for detecting the positional deviation between the one-end line mark and the other-end line mark; calculation means for dividing the positional deviation detected by the comparison detection means by a predetermined number of characters per line to obtain the per-character deviation due to skew; and arithmetic means for calculating, for each character, a center position correction amount for the character being recognized, during character recognition processing performed after the one-line character pattern has been stored in the memory and the other-end line mark has been detected, based on the one-end line mark position information, the per-character deviation information, and information indicating the ordinal position of the character being recognized; wherein, when the character string is skewed, the center position correction amount is applied character by character to the character information stored in the memory, and recognition reading is performed while the center of each character is successively corrected by the deviation for that character.
JP54065747A 1979-05-28 1979-05-28 Character recognition correction method Expired JPS6027436B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP54065747A JPS6027436B2 (en) 1979-05-28 1979-05-28 Character recognition correction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP54065747A JPS6027436B2 (en) 1979-05-28 1979-05-28 Character recognition correction method

Publications (2)

Publication Number Publication Date
JPS55157074A JPS55157074A (en) 1980-12-06
JPS6027436B2 JPS6027436B2 (en) 1985-06-28

Family

ID=13295914

Family Applications (1)

Application Number Title Priority Date Filing Date
JP54065747A Expired JPS6027436B2 (en) 1979-05-28 1979-05-28 Character recognition correction method

Country Status (1)

Country Link
JP (1) JPS6027436B2 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5818771A (en) * 1981-07-24 1983-02-03 Fujitsu Ltd Reading system of optical character reader
JPS59148988A (en) * 1983-02-14 1984-08-25 Yokogawa Hokushin Electric Corp Picture processor
JPH03118680A (en) * 1989-09-30 1991-05-21 Anritsu Corp Character recognizing device

Also Published As

Publication number Publication date
JPS55157074A (en) 1980-12-06

Similar Documents

Publication Publication Date Title
JPH08123900A (en) Method and apparatus for decision of position for line scanning image
US6912325B2 (en) Real time electronic registration of scanned documents
JPS6027436B2 (en) Character recognition correction method
JPH05258146A (en) Correction device for oblique running data of paper sheet or the like
JPS6033332B2 (en) Information input method using facsimile
JPS6343789B2 (en)
JPH0782524B2 (en) Optical character reader
JP7310151B2 (en) Mark selection device and image processing device
JPH07192087A (en) Optical character reader
JP2608943B2 (en) Optical mark reading method
JP3544779B2 (en) Method and apparatus for extracting image data of paper sheets
JPH0221385A (en) Printer
JPS6325388B2 (en)
JPH05174184A (en) Optical character reader
JPS6156657B2 (en)
JPS61286983A (en) System for correcting inclination of character pattern data
JPS59206987A (en) Letter recognizing device
JPH0340430B2 (en)
JPH0235869A (en) Picture reader
JPS6029881A (en) Optical character reader
JPH04119480A (en) System for deciding effective four sides of rectangular paper sheets
JPS5880779A (en) Character reader
JPS59191680A (en) Optical character reader
JPH0896074A (en) Character recognizing device
JPH0632077B2 (en) Figure recognition device