JPS5878277A - Optical character reader - Google Patents

Optical character reader

Info

Publication number
JPS5878277A
Authority
JP
Japan
Prior art keywords
character
data
recorded
area
pattern
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP56177721A
Other languages
Japanese (ja)
Inventor
Masahiro Zaisho
税所 正博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toshiba Corp
Original Assignee
Toshiba Corp
Tokyo Shibaura Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Toshiba Corp, Tokyo Shibaura Electric Co Ltd
Priority to JP56177721A
Publication of JPS5878277A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/24 Character recognition characterised by the processing or recognition method
    • G06V30/242 Division of the character sequences into groups prior to recognition; Selection of dictionaries

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Discrimination (AREA)

Abstract

PURPOSE: To read efficiently the data on forms recorded in various character types, by selecting a prescribed dictionary according to the character type designating code recorded on the form and using it to recognize the characters of the recorded data. CONSTITUTION: A form 10 is provided with a read data recording area 11, where the data to be read is recorded in one of several predetermined character types, and a character type designation information recording area 12, where designating information unique to the character type of the data in area 11 is recorded. A photoelectric converting part 105 first reads the pattern of area 12 of the form 10 and then the pattern of area 11. The character pattern data is recognized one character at a time in a character recognizing part 107, which refers to the prescribed dictionary in a dictionary memory 104 to recognize the pattern of area 11. The character codes resulting from this pattern recognition are stored successively in an answer buffer 108.

Description

DETAILED DESCRIPTION OF THE INVENTION The present invention relates to an optical character reader that can handle multiple kinds of forms whose recorded data differ in font.

A conventional optical character reader (hereinafter OCR) reads the characters recorded (typed or printed) on a form according to a font specified in advance by format control information (hereinafter FC).

Conventionally, therefore, character recognition achieves high reading accuracy when reading so-called single-font forms, in which the characters recorded on every form to be read are unified in the font specified by the FC. When reading so-called mixed-font forms, however, which include forms bearing characters in a font other than the one specified by the FC, reading accuracy drops significantly.

In recent years, moreover, with the spread of simple typewriters and similar devices, there has been strong demand for an OCR that can handle multiple kinds of forms in differing fonts, printed not only by one specific printing device such as a line printer but also by other printing devices.

The present invention was made in view of the above circumstances, and its object is to provide an optical character reader that can read a mixture of multiple kinds of forms whose recorded data differ in font with high reading accuracy.

An embodiment of the present invention will be described below with reference to the drawings. FIG. 1 shows an example of the format of a form used in the embodiment.

In the figure, 10 is a form to be read by the OCR.

11 is a read data recording area (hereinafter the data area) in which the data to be read is typed or printed. 12 is a font designation information recording area (hereinafter the font code area) in which a font designation code specifying the font of the data (characters) recorded in data area 11 is recorded. The code is recorded in a font and character set that the OCR handling form 10 can recognize. Here, the font designation code for data area 11 of form 10 is assumed to be recorded as a three-digit numeric code.
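The form layout described above can be modeled as a small record, which makes the three-digit constraint on the font code explicit. This is only a sketch: the class name `Form`, its field names, and the sample values are assumptions for illustration and do not appear in the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Form:
    """One form (10): a font code area (12) plus a data area (11)."""
    font_code: str  # contents of font code area 12, e.g. "002"
    data: str       # contents of data area 11, recorded in the designated font

    def __post_init__(self):
        # The embodiment assumes a three-digit numeric font designation code.
        ok = (len(self.font_code) == 3
              and self.font_code.isascii()
              and self.font_code.isdigit())
        if not ok:
            raise ValueError(f"font code must be three digits, got {self.font_code!r}")

    @property
    def font_index(self) -> int:
        """Numeric value of the code, usable later to pick a dictionary."""
        return int(self.font_code)


form = Form(font_code="002", data="INVOICE 1981")  # hypothetical sample form
print(form.font_index)  # 2
```

Keeping the code as a string until it has been validated mirrors the fact that the reader first sees it as printed digits, not as a number.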

FIG. 2 is a block diagram showing the internal circuit components of the OCR in the embodiment. In the figure, 101 is a microprocessor (hereinafter μ-CPU) that controls the entire OCR, 102 is a ROM that stores the control microprogram for μ-CPU 101, and 103 is a RAM used as a work area and the like for μ-CPU 101. 104 is a character recognition dictionary memory (hereinafter DFM) that holds dictionaries (hereinafter D-FONT) 104_0, 104_1, 104_2, ..., 104_n, one for each font the OCR can handle. Each D-FONT 104_0, 104_1, 104_2, ..., 104_n stores character recognition data for its own font.
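One way to picture the dictionary memory is as an indexed collection of per-font dictionaries. The sketch below assumes that each D-FONT can be represented as a mapping from characters to reference patterns and that the index of a D-FONT equals its position in the memory. The patent does not state the internal format of the recognition data, so `DFont`, `DictionaryMemory`, `select`, and the font names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DFont:
    """One dictionary (D-FONT): recognition data for a single font."""
    name: str
    templates: dict = field(default_factory=dict)  # character -> reference pattern


class DictionaryMemory:
    """Sketch of DFM 104: D-FONTs 104_0 .. 104_n addressed by index."""

    # Assumption: index 0 holds the predetermined numeric dictionary (104_0)
    # that the description uses for reading the font code itself.
    DIGIT_FONT_INDEX = 0

    def __init__(self, fonts):
        self._fonts = list(fonts)

    def select(self, index: int) -> DFont:
        if not 0 <= index < len(self._fonts):
            raise KeyError(f"no D-FONT registered for index {index}")
        return self._fonts[index]


dfm = DictionaryMemory([
    DFont("digits for font codes"),
    DFont("line printer"),
    DFont("typewriter"),
])
print(dfm.select(2).name)  # typewriter
```

Raising an error for an unregistered index is a design choice of this sketch; the patent does not say what the reader does when a form carries an unknown code.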

105 is a photoelectric conversion unit that reads the character patterns on form 10, and 106 is a line buffer that stores the pattern data read by photoelectric conversion unit 105 line by line. 107 is a character recognition unit that receives the pattern data stored in line buffer 106 and sequentially recognizes the character pattern data obtained, one character at a time, by character segmentation. 107R is a font code register that stores the contents of font code area 12 as read by character recognition unit 107. According to the contents of font code register 107R, character recognition unit 107 selects one D-FONT 104_i in DFM 104 when recognizing the characters in data area 11 of form 10. That is, the contents of font code register 107R are used as the D-FONT designation value for DFM 104 when character recognition unit 107 recognizes the data recorded in data area 11 of form 10. When recognizing the characters in font code area 12, character recognition unit 107 refers to a predetermined numeric D-FONT (here 104_0) to recognize the numeric pattern, and stores the resulting numeric data in font code register 107R.
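The description says only that per-character patterns are obtained from the line buffer by character segmentation; it does not name a method. As a minimal sketch, one common approach is to split the line wherever a column contains no dark pixels. The function name `segment_characters` and the `#`/`.` pixel encoding are assumptions made for this example.

```python
def segment_characters(line_rows):
    """Split one buffered line into per-character patterns.

    line_rows: list of equal-length strings, '#' = dark pixel, '.' = light.
    Returns a list of patterns, each a list of row strings.
    """
    if not line_rows:
        return []
    width = len(line_rows[0])
    # A column is "inked" if any row has a dark pixel in it.
    inked = [any(row[x] == "#" for row in line_rows) for x in range(width)]

    patterns, start = [], None
    for x, dark in enumerate(inked + [False]):  # sentinel closes the last run
        if dark and start is None:
            start = x                           # a character begins here
        elif not dark and start is not None:
            patterns.append([row[start:x] for row in line_rows])
            start = None                        # back in the gap between characters
    return patterns


line_buffer = [
    "##..#.#",
    "#...###",
    "##..#.#",
]
for pattern in segment_characters(line_buffer):
    print(pattern)
```

Blank-column splitting fails when adjacent characters touch, which is one reason this should be read as an illustration rather than as the method the patent's reader uses.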

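108 is an answer buffer that stores the data recognized by character recognition unit 107, that is, the answers. 109 is an external interface unit for exchanging data with external devices. 110 is a system bus used to transfer data, addresses, and control information.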
108 is 1 &! ! An answer buffer 109 for storing processed data, that is, answers, is an external interface unit for transmitting and receiving data to and from an external device. , 1'10 is a system bus used for transferring data addresses and control information.

The operation of the embodiment will now be described. On each form 10 to be read by the OCR, a font designation code unique to the font of the data recorded in data area 11 is recorded (written, typed, or printed) in advance in font code area 12. Form 10 is conveyed to the reading scanner by a form transport mechanism (not shown). Under the control of μ-CPU 101, photoelectric conversion unit 105 first reads the pattern in font code area 12 and then reads the pattern in data area 11.

The character pattern data read by photoelectric conversion unit 105 is stored line by line in line buffer 106 and then sent one character at a time to character recognition unit 107 for pattern recognition. When character recognition unit 107 receives the numeric pattern data of font code area 12, it refers to the predetermined D-FONT 104_0 in DFM 104 and recognizes the numeric pattern. The numeric value of the recognized font designation code is latched in font code register 107R. Next, when character recognition unit 107 receives the pattern data of data area 11, it selects one D-FONT 104_i in DFM 104 according to the contents of font code register 107R and performs pattern recognition of data area 11 with reference to that D-FONT 104_i. The answer data (character codes) produced by this pattern recognition is stored successively in answer buffer 108 via system bus 110. This character recognition with per-form font designation is repeated each time a form 10 is read.
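The two-phase operation can be sketched as follows: read the font code with the fixed digit dictionary, latch it, then read the data area with the dictionary that code selects. The matching step here is nearest-template matching by Hamming distance, which is a stand-in chosen for brevity; the patent does not specify how a pattern is compared with a dictionary. The four-cell patterns and every name in the block are hypothetical.

```python
def hamming(a, b):
    """Number of differing cells between two equal-length pattern strings."""
    return sum(x != y for x, y in zip(a, b))


def recognize(pattern, dfont):
    """Return the character whose template is closest to the pattern."""
    return min(dfont, key=lambda ch: hamming(dfont[ch], pattern))


def read_form(code_patterns, data_patterns, dfm):
    # Phase 1: read font code area 12 with the fixed digit dictionary 104_0.
    digits = "".join(recognize(p, dfm[0]) for p in code_patterns)
    font_code_register = int(digits)  # value latched in register 107R

    # Phase 2: select D-FONT 104_i and read data area 11 with it.
    dfont = dfm[font_code_register]
    answer_buffer = [recognize(p, dfont) for p in data_patterns]
    return font_code_register, "".join(answer_buffer)


# Toy dictionary memory: index 0 is the digit dictionary, 1 and 2 are fonts.
dfm = {
    0: {"0": "....", "1": "...#", "2": "..#."},
    1: {"A": "#..#", "B": "##.."},
    2: {"A": "##..", "B": "#..#"},
}

data = ["##..", "#..#"]  # the same two glyph patterns on both forms
print(read_form(["....", "....", "...#"], data, dfm))  # (1, 'BA')
print(read_form(["....", "....", "..#."], data, dfm))  # (2, 'AB')
```

In this toy setup the same glyph patterns decode to different text depending on the code printed on the form, which is the behavior the font code register is there to provide.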
8 (= is stored. Such a character g recognition operation by specifying the font for each form is repeated every time the 6 books #410... are sharpened.

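As described above, each time a form 10 is read, a D-FONT 104_i is selected according to the font designation code recorded on that form 10 and is used to recognize the characters of the data recorded on the same form. This makes it possible to read a mixture of many kinds of forms whose recorded data differ in font, and it significantly improves reading accuracy.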
As mentioned above, when reading the 6 forms w410..., the font 10I' - recorded font designation code (=
Therefore, select D-FONTJ 041 and create the same form 10.
By configuring the above ('''-1-1 The reading accuracy can be significantly improved.

As detailed above, the present invention provides an optical character reader that can read a mixture of multiple kinds of forms whose recorded data differ in font with high reading accuracy.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1 and 2 illustrate an embodiment of the present invention. FIG. 1 shows an example of the format of a form, and FIG. 2 is a block diagram showing the internal configuration of the OCR. 10: form; 11: read data recording area (data area); 12: font designation information recording area (font code area); 101: microprocessor (μ-CPU); 102: ROM; 103: RAM; 104: character recognition dictionary memory (DFM); 104_0, 104_1, 104_2, ..., 104_n: dictionaries (D-FONT); 105: photoelectric conversion unit; 106: line buffer; 107: character recognition unit; 107R: font code register; 108: answer buffer.
Applicant's agent: Patent Attorney Takehiko Suzue

Claims (1)

[Scope of Claims] An optical character reader comprising: a form having a read data recording area, in which data to be read is recorded in any one of a plurality of predetermined fonts, and a font designation information recording area, in which font designation information unique to the font of the data recorded in the read data recording area is recorded; a plurality of character recognition dictionaries, each unique to one of the plurality of fonts; dictionary selection means for selecting, from the plurality of character recognition dictionaries, the character recognition dictionary unique to the font designated by the font designation information, according to the result of reading the font designation information recorded in the font designation information recording area of the form; and a recognition unit that performs character recognition of the data recorded in the read data recording area of the form using the selected character recognition dictionary.
JP56177721A 1981-11-05 1981-11-05 Optical character reader Pending JPS5878277A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP56177721A JPS5878277A (en) 1981-11-05 1981-11-05 Optical character reader

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP56177721A JPS5878277A (en) 1981-11-05 1981-11-05 Optical character reader

Publications (1)

Publication Number Publication Date
JPS5878277A true JPS5878277A (en) 1983-05-11

Family

ID=16035944

Family Applications (1)

Application Number Title Priority Date Filing Date
JP56177721A Pending JPS5878277A (en) 1981-11-05 1981-11-05 Optical character reader

Country Status (1)

Country Link
JP (1) JPS5878277A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2596896A1 (en) * 1986-04-03 1987-10-09 Gachot Sa Optical reading method, and table of typical characters relating to it
EP0692768A3 (en) * 1994-07-15 1997-05-02 Horst Froessl Full text storage and retrieval in image at OCR and code speed

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5220728A (en) * 1975-08-11 1977-02-16 Nec Corp Character reader
JPS5680788A (en) * 1979-12-05 1981-07-02 Fujitsu Ltd Character recognition system

Similar Documents

Publication Publication Date Title
US4562304A (en) Apparatus and method for emulating computer keyboard input with a handprint terminal
US4566039A (en) Facsimile system
GB2238640A (en) Multiple-bus controller for printer
JPS631618B2 (en)
JPS5878277A (en) Optical character reader
EP0009662B1 (en) Method and apparatus for storing and reconstructing chinese-like characters
JPH0330977A (en) Page printer control system
EP0072708B1 (en) Printer
JPH06343115A (en) Printer device and facsimile device capable of printing and reading bar code
JPS62255993A (en) Image output unit
JPH01150568A (en) Printer device
JPH0621978B2 (en) Print control device
JP2529421B2 (en) Character recognition device
JPH0347766A (en) Serial dot printer
KR950004219B1 (en) Method and apparatus for font storage
JPS6042086A (en) Printer
JPS58201674A (en) Method for registering and printing special pattern
JPS63262262A (en) Printer device
JPS62251884A (en) Recorder
JPS58195946A (en) Word processor
JPH02141797A (en) Character pattern generating device
JPH01228875A (en) Printer
JPS63309956A (en) Phototype setting system
JPH0499665A (en) Printer
JPH11305981A (en) Image forming device