JPS5827276A

JPS5827276A - Optical character reader

Info

Publication number: JPS5827276A
Application number: JP56126342A
Authority: JP
Inventors: Masaki Komiya; 小宮　雅紀
Original assignee: Toshiba Corp; Tokyo Shibaura Electric Co Ltd
Current assignee: Toshiba Corp
Priority date: 1981-08-12
Filing date: 1981-08-12
Publication date: 1983-02-17

Abstract

PURPOSE: To correct data without using an additional line or an additional field, by writing an insertion symbol and additional data against data already entered on a form. CONSTITUTION: When scanning of a prescribed line is completed and its projections have been taken, a recognition section 6 scans the contents of a vertical projection section 4 and a horizontal projection section 5. If a projection twice or more the normal character pitch and line pitch is detected, a detecting means included in the recognition section 6 is driven to check whether the detected mark is an insertion symbol. The recognition section 6 then scans an image memory section 2 and detects the X and Y coordinates of the highest point of the insertion symbol 13. The insertion symbol 13 and an additional character line 14 are masked, the quantized pattern of the character line 12 is cut out into character patterns of one character each and recognized, and the recognition results are stored successively in a buffer (not shown) in the recognition section 6.

Description

[Detailed description of the invention]

The present invention relates to an optical character reader, and in particular to one with an improved method of correcting the data being read.

Conventionally, optical character readers (hereinafter referred to as OCR) have mostly been used for entering data such as simple numerals, but as the range of readable character types grows, the use of OCR for text (manuscript) input is expected to expand. Japanese and English word processors are becoming widespread as text-processing equipment, and OCR is a natural candidate as an input machine for them. For such an OCR, reject handling becomes the job of the word processor and is no longer needed, and the form inevitably becomes free-format (manuscript paper, letter paper) rather than the field-specified layout used for data entry. The amount of input data therefore increases, and omissions and writing errors increase with it, so a correction method different from that of a data-entry OCR is required. On the other hand, data can of course be corrected and edited on the word processor, so if the location and content of each mistake are marked in a dropout color when the manuscript is prepared, the correction can be made on the word processor after input. Even so, it would be convenient if simple corrections could be made with the OCR alone at the time the manuscript is written.

However, with a conventional stand-alone OCR, data has been corrected in one of two ways. In the first, a field (column) dedicated to cancellation is provided; writing a mark or a specific symbol there cancels the data in the associated line or field, and new data is written in a separately prepared spare field or spare line to update or replace the data. In the second, the data is transferred to an external storage device and its contents are then changed using a CRT and keyboard.

This invention was made in view of the above circumstances. Its object is to provide an optical character reader that can correct data already written on a form, by having an insertion symbol and additional data written in, without using an additional line or an additional field.

An embodiment of the present invention will now be described with reference to the drawings.

FIG. 1 is a schematic block diagram of an embodiment of the present invention. In the figure, reference numeral 1 denotes a photoelectric conversion section, which converts the black-and-white image on the form into a photoelectric conversion signal.

Reference numeral 2 denotes an image memory section, which stores at least two form lines' worth of data, with the photoelectric conversion signal quantized into two values, white bits and black bits. Reference numeral 3 denotes an address control section, which controls the read and write addresses of the image memory section 2.

Reference numeral 4 denotes a vertical projection section, which takes the projection in the vertical direction while the image is being transferred to the image memory section 2. Reference numeral 5 denotes a horizontal projection section, which takes the projection in the horizontal direction during the same transfer.

Reference numeral 6 denotes a recognition section. Besides controlling the sections above, the recognition section 6 cuts out lines and characters from the image memory section 2 and recognizes each character.
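As an illustration of what the vertical projection section 4 and the horizontal projection section 5 compute, the following is a minimal software sketch. It rests on assumptions that are not in the original: the quantized image is held as a list of rows with 1 for a black bit and 0 for a white bit, the horizontal projection is taken to be the black-bit count of each row, and the vertical projection the black-bit count of each column. The function names are hypothetical.

```python
from typing import List

Bitmap = List[List[int]]  # assumed layout: rows of bits, 1 = black, 0 = white


def horizontal_projection(img: Bitmap) -> List[int]:
    """Black-bit count of each row (a profile along the Y axis)."""
    return [sum(row) for row in img]


def vertical_projection(img: Bitmap) -> List[int]:
    """Black-bit count of each column (a profile along the X axis)."""
    return [sum(col) for col in zip(*img)]


page = [
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
]
print(horizontal_projection(page))  # [3, 0, 2]
print(vertical_projection(page))    # [1, 2, 0, 2]
```

In the described device the projections are accumulated while the image is being transferred to memory; the sketch computes them afterwards only to keep the example short.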

Next, the operation of the above embodiment will be described. FIG. 2 shows an example of a filled-in form. In the figure, reference numeral 11 denotes a form for manuscript input, 12 a written character line, 13 an insertion symbol indicating a correction, and 14 an additional character line to be inserted. As shown in FIG. 3, the insertion symbol 13 must be written in such a position and shape that, when the vertical projection section 4 and the horizontal projection section 5 take the vertical and horizontal projections of the character lines concerned, the projections of the additional character line 14 and of the character line 12 receiving the addition are joined. In addition, because the insertion symbol 13 is scanned and its highest point is taken as the insertion position, the insertion symbol 13 must have a shape, with a protrusion or convex part, that makes the insertion position clearly identifiable.

The character lines 12 and 14 are photoelectrically converted by the photoelectric conversion section 1 and stored in the image memory section 2 as quantized patterns. At this time, as shown in FIG. 3, the vertical projection section 4 and the horizontal projection section 5 take the vertical and horizontal projections of the character lines 12 and 14 (shown hatched in the figure). Because the insertion symbol 13 is written between the character lines 12 and 14, the projections stored in the vertical projection section 4 and the horizontal projection section 5 are, as shown in FIG. 3, twice or more the normal character pitch and line pitch.

When scanning of a given line has been completed and its projections have been taken in this way, the recognition section 6 scans the contents of the vertical projection section 4 and the horizontal projection section 5. If it detects a projection twice or more the normal character pitch and line pitch, it activates the detection means included in the recognition section 6 for detecting an insertion symbol, and checks whether the mark is in fact an insertion symbol.
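The trigger condition can be sketched as a search for unusually long runs in a projection profile. This is only an illustration under assumptions that are not in the original: a run is taken to be a maximal stretch of non-zero entries, the pitch is supplied by the caller in the same units (rows or columns), and the names `black_runs` and `oversized_runs` are hypothetical.

```python
from typing import List, Tuple


def black_runs(projection: List[int]) -> List[Tuple[int, int]]:
    """Return (start, length) for each maximal run of non-zero entries."""
    runs = []
    start = None
    for i, value in enumerate(projection):
        if value > 0 and start is None:
            start = i                       # a run begins
        elif value == 0 and start is not None:
            runs.append((start, i - start))  # a run ends
            start = None
    if start is not None:                   # run reaching the end of the profile
        runs.append((start, len(projection) - start))
    return runs


def oversized_runs(projection: List[int], pitch: int,
                   factor: float = 2.0) -> List[Tuple[int, int]]:
    """Runs at least `factor` times the normal pitch long."""
    return [(s, n) for s, n in black_runs(projection) if n >= factor * pitch]


# Row profile: one ordinary line (3 rows), a gap, then two lines bridged by a mark (7 rows).
profile = [0, 4, 5, 4, 0, 3, 5, 1, 1, 4, 5, 3, 0]
print(black_runs(profile))               # [(1, 3), (5, 7)]
print(oversized_runs(profile, pitch=3))  # [(5, 7)]
```

A long run only marks a candidate; the confirmation steps in the text are still needed to rule out dust or dirt.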

Specifically, the detection means traces the quantized black bits in the image memory section 2 at the position corresponding to the vicinity of the center of the horizontal projection stored in the horizontal projection section 5 (the vicinity of the inter-line margin), and detects that they are continuous. This confirms that the projection stored in the vertical projection section 4 is due to a written insertion symbol rather than to dust or dirt. Next, it confirms that the insertion symbol consists of one continuous line drawn in a single stroke, has a convex part in the Y direction, and does not touch a character line. Then the image memory section 2 is scanned horizontally at the position corresponding to the vicinity of the center of the horizontal projection, and the number of points at which a white bit changes to a black bit is counted. Although the scanning position here is taken as the position corresponding to the vicinity of the center of the horizontal projection, it may instead be chosen as follows: the image memory section 2 is scanned horizontally, the number of black bits is counted to create a histogram, and the position in the image memory section 2 that corresponds to a low histogram value near the center of the histogram is used as the scanning position.
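The alternative choice of scanning position can be sketched as follows. The text does not say how wide "near the center" is or how ties are resolved, so the window size and the tie-break toward the center below are assumptions, as is the function name `pick_scan_row`.

```python
from typing import List


def pick_scan_row(row_hist: List[int], start: int, length: int) -> int:
    """Pick a row with few black bits near the center of a run of rows.

    row_hist[y] is the black-bit count of row y; the run covers rows
    start .. start + length - 1. Ties go to the row nearest the center.
    """
    center = start + (length - 1) / 2
    half = max(1, length // 4)  # assumed half-width of the search window
    lo = max(start, int(center) - half)
    hi = min(start + length - 1, int(center) + half)
    return min(range(lo, hi + 1),
               key=lambda y: (row_hist[y], abs(y - center)))


# Rows 5..11 hold two character lines bridged by an insertion mark.
hist = [0, 4, 5, 4, 0, 3, 5, 1, 1, 4, 5, 3, 0]
print(pick_scan_row(hist, start=5, length=7))  # 8
```

Scanning where the histogram is low keeps the scan line in the inter-line margin, where it crosses the insertion symbol but not the characters.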

Half the number of white-to-black transition points counted as described above is the number of insertion symbols present. The coordinates of these white-to-black transition points are also stored. Then, starting from these points, line-edge tracing or line-edge circle tracing is performed to confirm that the two white-to-black transition points regarded as belonging to one insertion symbol are joined by a continuous line.
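The counting rule follows from the shape of the symbol: a horizontal scan line through a caret-like mark crosses its two legs, giving two white-to-black transitions per symbol. The sketch below illustrates this and the connectivity check. It is not the tracing method of the original: a breadth-first flood fill over 8-connected black bits is substituted for line-edge tracing, and the bitmap layout and all names are assumptions.

```python
from collections import deque
from typing import List, Tuple

Bitmap = List[List[int]]  # assumed layout: rows of bits, 1 = black, 0 = white


def rising_edges(row: List[int]) -> List[int]:
    """X positions where a white bit is followed by a black bit."""
    xs, prev = [], 0  # the left margin is treated as white
    for x, bit in enumerate(row):
        if prev == 0 and bit == 1:
            xs.append(x)
        prev = bit
    return xs


def connected(img: Bitmap, a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """True if black bits a and b, given as (x, y), are joined by black bits."""
    h, w = len(img), len(img[0])
    seen, queue = {a}, deque([a])
    while queue:
        x, y = queue.popleft()
        if (x, y) == b:
            return True
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if (0 <= nx < w and 0 <= ny < h
                        and img[ny][nx] == 1 and (nx, ny) not in seen):
                    seen.add((nx, ny))
                    queue.append((nx, ny))
    return False


caret = [
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
]
scan_y = 2
edges = rising_edges(caret[scan_y])          # transitions on the scan line
symbols = len(edges) // 2                    # two transitions per symbol
pairs = list(zip(edges[0::2], edges[1::2]))  # left and right leg of each symbol
ok = all(connected(caret, (l, scan_y), (r, scan_y)) for l, r in pairs)
print(symbols, pairs, ok)  # 1 [(1, 5)] True
```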

The insertion symbol 13 is confirmed in this way. The recognition section 6 then scans the image memory section 2 and detects the X and Y coordinates of the highest point of the insertion symbol 13. The coordinates of the detected point are stored as the insertion position for the additional data.
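Locating the highest point can be sketched as a top-to-bottom scan of the region that contains the symbol. The assumptions here are not from the original: Y increases downward, so the highest point is the black bit with the smallest Y, the search region is given as half-open ranges and contains only the insertion symbol, and `highest_point` is a hypothetical name.

```python
from typing import List, Optional, Tuple

Bitmap = List[List[int]]  # assumed layout: rows of bits, 1 = black, 0 = white


def highest_point(img: Bitmap, x_range: Tuple[int, int],
                  y_range: Tuple[int, int]) -> Optional[Tuple[int, int]]:
    """Return (x, y) of the first black bit met scanning rows from the top."""
    for y in range(y_range[0], y_range[1]):
        for x in range(x_range[0], x_range[1]):
            if img[y][x] == 1:
                return (x, y)
    return None  # no black bit in the region


caret = [
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
]
print(highest_point(caret, (0, 7), (0, 3)))  # (3, 0)
```

The X coordinate of the returned point is what later decides between which two characters the additional data is inserted.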

The recognition section 6 also masks the insertion symbol 13 and the additional character line 14, cuts the quantized pattern of the character line 12 into character patterns of one character each, recognizes them, and stores the recognition results successively in a buffer (not shown) in the recognition section 6. In the same way, the recognition section 6 cuts the quantized pattern of the additional character line 14 into character patterns of one character each, recognizes them, and stores the recognition results successively in the buffer. Then, based on the stored insertion position for the additional data, the recognition section 6 rearranges the recognition results in the buffer and stores them with the contents of the additional character line inserted at the position where they belong.
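The final rearrangement can be sketched as a splice of the two recognition results. How the insertion position is mapped to a place in the buffer is not spelled out in the text, so the rule below is an assumption: each recognized character of the line 12 carries the X coordinate of its center, the characters are ordered left to right, and the additional characters go after every character whose center lies to the left of the insertion X. The name `splice` is hypothetical.

```python
from typing import List, Sequence, Tuple


def splice(main: Sequence[Tuple[str, float]], extra: Sequence[str],
           insert_x: float) -> List[str]:
    """Insert `extra` into `main` at the gap indicated by insert_x.

    main:  (character, center_x) pairs of the original line, left to right.
    extra: recognized characters of the additional line, in reading order.
    """
    idx = sum(1 for _, cx in main if cx < insert_x)  # characters left of the mark
    chars = [c for c, _ in main]
    return chars[:idx] + list(extra) + chars[idx:]


line_12 = [("A", 5), ("B", 15), ("E", 25)]
line_14 = ["C", "D"]
print("".join(splice(line_12, line_14, insert_x=20)))  # ABCDE
```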

An OCR configured as above therefore has the following effects.

(1) Additions can be made easily to a manuscript-paper-style input form by using a special symbol for insertion.

(2) The line immediately below the line to be added to can be used as the line for additions, without providing a special line or field for them.

As described above, this invention can provide an optical character reader in which data already written on a form can be corrected, without using an additional line or an additional field, by writing in an insertion symbol and additional data.

[Brief explanation of drawings]

FIG. 1 is a schematic block diagram of an embodiment of the present invention, FIG. 2 is a diagram showing an example of a filled-in form in the embodiment, and FIG. 3 is an explanatory diagram for explaining the operation of the embodiment.

1... Photoelectric conversion section, 2... Image memory section, 3... Address control section, 4... Vertical projection section, 5... Horizontal projection section

Claims

[Claims]

An optical character reader comprising: a photoelectric conversion section that photoelectrically converts characters written on a form into a quantized pattern; an image memory section that stores the quantized patterns of at least two character lines of the form; a vertical projection section that takes a vertical projection of the quantized pattern stored in the image memory section; a horizontal projection section that takes a horizontal projection of the quantized pattern stored in the image memory section; and a recognition section that, based on the stored projections, detects an insertion symbol written between the character lines and rearranges the recognized character results of the two character lines based on the position of the detected insertion symbol.