JPS58105385A - Character reading and recognizing device - Google Patents

Character reading and recognizing device

Publication number
JPS58105385A
Authority
JP
Japan
Prior art keywords
overlap
row
characters
projection component
circuit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP56202942A
Other languages
Japanese (ja)
Inventor
Mamoru Maeda
護 前田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ricoh Co Ltd
Priority to JP56202942A
Publication of JPS58105385A

Abstract

PURPOSE: To recognize the text belonging to the same character row by judging two patterns to be in the same row when the overlap between the projection component, in a specific direction, of an independent pattern on an original and that of an adjacent independent pattern exceeds a certain value. CONSTITUTION: When characters are read and recognized and the text row to which each character belongs is determined, an independent pattern extracting circuit 42 searches for independent patterns S3 while moving a segmenting window 50 that is sufficiently larger than characters 51-54. A projection component inspecting circuit 43 extracts from the signal S3 the horizontal projection component for vertical writing and the vertical projection component for horizontal writing. The circuit 43 then inspects the overlap between the projection component of the immediately preceding (adjacent) character, stored in a memory, and the projection component of the present character. The characters are judged to be in the same row when the overlap is larger than a prescribed value and in different rows when it is smaller, and a prescribed code is assigned in each case.

Description

DETAILED DESCRIPTION OF THE INVENTION

Technical Field of the Invention

This invention relates to a character reading and recognizing device, and in particular to a device for determining the text line to which each character belongs.

Prior Art

In some optical character readers, to make text lines easier to read, marks 11 indicating the lines are provided on a sheet 10, as shown in FIG. 1, and these marks 11 delimit the range containing one line of text 12.

With this method, however, the marks must be specially provided on the sheet and the text must be entered in accordance with them, which is cumbersome. In particular, line extraction fails when the text lines are strongly skewed. That is, as shown in FIG. 2, even for an original on which the characters are printed in horizontal alignment, the image data is skewed if the original is set obliquely and is tilted at reading time. When the line spacing is narrow, the breaks between lines cannot be detected even if the horizontal projection components 1a, 2a, 3a, ..., 101a, 102a, 103a, ... shown in FIG. 3 are obtained, so it is impossible to judge whether or not characters are on the same line.
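The failure described above can be reproduced with a toy example. The following Python sketch is only an illustration under stated assumptions: text lines are modelled as solid bands of pixels, skew is modelled as a downward shift of one row every `slope_den` columns, and the names `make_page`, `row_profile`, and `count_gaps` are hypothetical helpers, not anything from the patent.

```python
def make_page(width, height, line_tops, line_height, slope_den=None):
    """Build a binary image (list of rows) with one solid band per text line.

    slope_den=None gives an unskewed page; otherwise every column x is
    shifted down by x // slope_den rows to imitate a tilted original.
    """
    image = [[0] * width for _ in range(height)]
    for x in range(width):
        shift = x // slope_den if slope_den else 0
        for top in line_tops:
            for dy in range(line_height):
                y = top + dy + shift
                if 0 <= y < height:
                    image[y][x] = 1
    return image


def row_profile(image):
    """Projection of the whole page onto the vertical axis: pixels per row."""
    return [sum(row) for row in image]


def count_gaps(profile):
    """Count runs of empty rows lying between the first and last inked rows."""
    inked = [i for i, value in enumerate(profile) if value > 0]
    if not inked:
        return 0
    gaps = 0
    in_gap = False
    for value in profile[inked[0]:inked[-1] + 1]:
        if value == 0 and not in_gap:
            gaps += 1
        in_gap = value == 0
    return gaps


straight = make_page(40, 16, line_tops=[2, 7], line_height=3)
skewed = make_page(40, 16, line_tops=[2, 7], line_height=3, slope_den=8)
print(count_gaps(row_profile(straight)))  # one break between the two lines
print(count_gaps(row_profile(skewed)))    # the break is filled in by the skew
```

With a 2-row gap and a drift of 4 rows across the page, the whole-page profile no longer contains an empty row, which is the situation the passage attributes to FIG. 3.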

Purpose of the Invention

The present invention was made in view of the above circumstances, and its object is to provide a character reading and recognizing device capable of extracting lines even from free-format text images.

To achieve this object, the invention comprises: a first device that extracts independent patterns from an original; a second device that detects the projection component, in a specific direction, of each independent pattern obtained by the first device; and a third device that tests the overlap between these projection components of adjacent independent patterns, judges the patterns to be on the same line if the overlap is equal to or greater than a certain value and on different lines if it is equal to or less than the certain value, and assigns a predetermined code in each case.

Embodiment of the Invention

The invention is described in detail below with reference to the accompanying drawings.

FIG. 4 is a block diagram showing an embodiment of the invention. It shows a character reader 40, an isolated point removal circuit 41, an independent (connected) pattern extraction circuit 42, a projection component test circuit 43, a line extraction circuit 44, a post-processing circuit 45, and a recognition device 46.

The character reader 40 is, for example, any of various known optical readers, and converts an optical image into image data S1 in units of pixels.

The isolated point removal circuit 41 removes noise and the like that is unrelated to the text; various circuits of this kind are known. Its output S2 is the image signal with the noise removed.
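The text does not say which noise removal method circuit 41 uses, only that various circuits are known. As one possible illustration, the Python sketch below clears every foreground pixel that has no foreground pixel among its eight neighbours. The function name `remove_isolated_points` and the 8-neighbour rule are assumptions made for this example.

```python
def remove_isolated_points(image):
    """Return a copy of a binary image with single-pixel specks cleared.

    image is a list of rows of 0/1 values. A foreground pixel is treated as
    noise when none of its (up to) eight neighbours is foreground. This rule
    is an assumed stand-in for circuit 41, not the patent's own method.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            has_neighbour = any(
                image[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbour:
                cleaned[y][x] = 0
    return cleaned


noisy = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
for row in remove_isolated_points(noisy):
    print(row)
```

Neighbours are read from the input image rather than the partly cleaned copy, so the result does not depend on scan order.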

As shown in FIG. 5, the independent pattern extraction circuit 42 searches for each independent pattern while moving a cut-out window 50 that is sufficiently larger than the characters 51 to 54. The signal constituting a pattern is S3.
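The patent finds independent patterns with a moving cut-out window. A software analogue that yields comparable output is connected-component labelling, which is swapped in here as a different technique for illustration. In this Python sketch the function name `extract_patterns`, the use of 8-connectivity, and the bounding-box output format are all assumptions.

```python
from collections import deque


def extract_patterns(image):
    """Find 8-connected foreground patterns in a binary image.

    Returns inclusive bounding boxes (top, left, bottom, right), in the order
    in which a raster scan first meets each pattern. This flood fill is a
    substitute for the moving-window search of circuit 42.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny in range(max(0, cy - 1), min(height, cy + 2)):
                    for nx in range(max(0, cx - 1), min(width, cx + 2)):
                        if image[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


page = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
]
print(extract_patterns(page))
```

Note that a character made of disconnected strokes would come out as several patterns under this rule; the sketch makes no attempt to merge them.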

Of the independent patterns shown in FIG. 5, the characters T (51) and m (52) have already been extracted and given a processed code, the character c (53) is the independent pattern currently detected, and the character H (54) is a pattern touching the window 50.

Based on the signal S3 representing an independent pattern, which is the output of the independent pattern extraction circuit 42, the projection component test circuit 43 extracts the horizontal projection component in the case of a vertically written document and the vertical projection component in the case of a horizontally written document.
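The projection step can be sketched in Python as follows. The sketch assumes that, for horizontal writing, the relevant component is the pattern's profile along the vertical axis (one count per image row), and for vertical writing its profile along the horizontal axis (one count per image column). That reading of the terms, and the names `projection_profile` and `projection_extent`, are assumptions for illustration.

```python
def projection_profile(pixels, writing):
    """Project one pattern onto the axis perpendicular to the writing direction.

    pixels  -- iterable of (y, x) foreground coordinates of a single pattern
    writing -- "horizontal" (profile over rows) or "vertical" (over columns)
    Returns a dict mapping each occupied coordinate to its pixel count.
    """
    if writing == "horizontal":
        axis = 0  # keep the y coordinate
    elif writing == "vertical":
        axis = 1  # keep the x coordinate
    else:
        raise ValueError("writing must be 'horizontal' or 'vertical'")
    profile = {}
    for pixel in pixels:
        key = pixel[axis]
        profile[key] = profile.get(key, 0) + 1
    return profile


def projection_extent(profile):
    """Inclusive (start, end) span covered by a non-empty projection profile."""
    return min(profile), max(profile)


pattern = [(2, 5), (3, 5), (4, 5), (4, 6)]  # a small L-shaped pattern
rows = projection_profile(pattern, "horizontal")
print(rows, projection_extent(rows))
cols = projection_profile(pattern, "vertical")
print(cols, projection_extent(cols))
```

Only the extent is needed for an interval overlap test, but the full profile is kept in case a weighted overlap measure is preferred.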

The circuit 43 also has a memory, a comparator, and a code setter (none of which are shown). It tests the overlap between the projection component of the immediately preceding character, stored in the memory, and the projection component of the current character, and compares this overlap with a predetermined threshold held in the comparator. If the overlap is larger than the threshold, the characters are judged to be on the same line; if it is smaller than the threshold, they are judged to be on different lines. A predetermined code is assigned in each case.
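The overlap test can be illustrated with a short Python sketch. It assumes each projection component is summarized as an inclusive interval `(start, end)`, that the "predetermined code" is a running integer line number, and that an overlap exactly equal to the threshold counts as the same line (the text leaves the equality case open). The names `overlap_length` and `assign_line_codes` are hypothetical.

```python
def overlap_length(a, b):
    """Number of coordinates shared by two inclusive intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def assign_line_codes(extents, threshold):
    """Assign a line code to each character, comparing adjacent characters only.

    extents   -- projection intervals of the characters, in reading order
    threshold -- minimum overlap for two neighbours to share a line
    The variable `previous` plays the role of the memory in circuit 43.
    """
    codes = []
    line_code = 0
    previous = None
    for extent in extents:
        if previous is not None and overlap_length(previous, extent) < threshold:
            line_code += 1  # too little overlap: a new line starts here
        codes.append(line_code)
        previous = extent
    return codes


# Four characters drifting downward by one pixel each (a skewed line),
# followed by two characters of the next line.
extents = [(10, 19), (11, 20), (12, 21), (13, 22), (30, 39), (31, 40)]
print(assign_line_codes(extents, threshold=5))
```

Because each character is compared only with its immediate predecessor, a gradual drift caused by skew does not break the line, whereas the jump to the next line produces little or no overlap.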

The line extraction circuit 44 sequentially reads out the signals S4 constituting those independent patterns whose codes, among the codes assigned by the projection component test circuit 43, belong to the same line.
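Reading out patterns by code amounts to grouping. A minimal Python sketch, assuming the codes are sortable values such as integer line numbers and using the hypothetical name `read_out_lines`:

```python
def read_out_lines(patterns, codes):
    """Group patterns by line code, keeping reading order within each line.

    patterns -- the extracted patterns (any objects), in reading order
    codes    -- the line code assigned to each pattern, same length
    Returns a list of lines, ordered by code, each a list of patterns.
    """
    if len(patterns) != len(codes):
        raise ValueError("patterns and codes must have the same length")
    lines = {}
    for pattern, code in zip(patterns, codes):
        lines.setdefault(code, []).append(pattern)
    return [lines[code] for code in sorted(lines)]


print(read_out_lines(["a", "b", "c", "d", "e"], [0, 0, 1, 1, 1]))
```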

Based on these signals S4, the post-processing circuit 45 sequentially forms the independent patterns belonging to the same line, and the signal S6 forming these independent patterns causes the recognition device 46 to display the given text. The recognition device 46 is any of various known display devices, printers, or the like.

With the above configuration, the image information S1 read by the character reader 40 has unwanted signals removed by the isolated point removal circuit 41, and the independent pattern extraction circuit 42 then extracts the independent patterns S3. From each pattern S3, the projection component test circuit 43 takes the projection component of the pattern, tests its overlap with the projection component of the adjacent independent pattern, judges whether or not the two belong to the same line, and assigns a predetermined code. The line extraction circuit 44 and the post-processing circuit 45 read out the patterns sorted by these codes, and the text belonging to the same character line is output via the recognition device 46.

Effects of the Invention

Configured as described above, the invention can provide a character reading and recognizing device that reads and recognizes text lines accurately even when the lines are skewed, and in particular even in free-format documents.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1 to 3 illustrate the conventional line reading method, FIG. 4 is a block diagram showing an embodiment of the invention, and FIG. 5 illustrates the operation of the projection component test circuit 43 in FIG. 4.

40: character reader; 41: isolated point removal circuit; 42: independent pattern extraction circuit; 43: projection component test circuit; 44: line extraction circuit; 45: post-processing circuit; 46: recognition device.

Applicant's agent: Kiyoshi Inomata

Claims (1)

[Claims] A character reading and recognizing device comprising: a first device that extracts independent patterns from an original; a second device that detects the projection component, in a specific direction, of each independent pattern obtained by the first device; and a third device that tests the overlap between the said projection components of adjacent independent patterns, judges the patterns to be on the same line if the overlap is equal to or greater than a certain value and on different lines if it is equal to or less than the certain value, and assigns a predetermined code in each case.
JP56202942A 1981-12-16 1981-12-16 Character reading and recognizing device Pending JPS58105385A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP56202942A JPS58105385A (en) 1981-12-16 1981-12-16 Character reading and recognizing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP56202942A JPS58105385A (en) 1981-12-16 1981-12-16 Character reading and recognizing device

Publications (1)

Publication Number Publication Date
JPS58105385A true JPS58105385A (en) 1983-06-23

Family

ID=16465707

Family Applications (1)

Application Number Title Priority Date Filing Date
JP56202942A Pending JPS58105385A (en) 1981-12-16 1981-12-16 Character reading and recognizing device

Country Status (1)

Country Link
JP (1) JPS58105385A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0681414U (en) * 1993-05-15 1994-11-22 株式会社斎藤器物製作所 Induction cooker pot

Similar Documents

Publication Publication Date Title
US20130129219A1 (en) Pattern recognition apparatus, pattern recogntion method, image processing apparatus, and image processing method
JP2001256505A (en) Recognition device recognition method, paper sheet processor and paper sheet processing method
JPS58105385A (en) Character reading and recognizing device
JP4300083B2 (en) Form reader
JP3268552B2 (en) Area extraction method, destination area extraction method, destination area extraction apparatus, and image processing apparatus
JPH07230525A (en) Method for recognizing ruled line and method for processing table
JPH10154191A (en) Business form identification method and device, and medium recording business form identification program
JPH03122786A (en) Optical character reader
EP1237115B1 (en) Automatic table location in documents
JP2978801B2 (en) Character input method for handwritten character recognition
JP2925270B2 (en) Character reader
JPH0433075B2 (en)
JPS6389990A (en) Character reading system
JPS6252687A (en) Character detecting and segmenting system for character reader
JP3112190B2 (en) How to set the recognition target area
JPH05282487A (en) Character recognizing device
JP2001043372A (en) Character checking device
JPH0264882A (en) Address reading device
JPH10134145A (en) Character segmenting method, character recognition device using the same, and computer-readable storage medium where program implementing the same character segmenting method is stored
JPH10124610A (en) Optical character reading device
JPH03164885A (en) Optical character reader
JPH03296884A (en) Device for extracting character image
JPH06301814A (en) Character reader
JPS63136181A (en) Character reader
JP2005242825A (en) Business form reading device and business form direction determination method by business form reading device