JPS6330667B2 - - Google Patents

Info

Publication number
JPS6330667B2
Authority
JP
Japan
Prior art keywords
character
projection
area
data
reading
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
JP55187927A
Other languages
Japanese (ja)
Other versions
JPS57113182A (en)
Inventor
Kazuo Tada
Noriaki Nawate
Hideyuki Mizuta
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd
Priority to JP55187927A
Publication of JPS57113182A
Publication of JPS6330667B2
Granted

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Input (AREA)

Description

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a method for cutting out characters in an optical character reading device.

Conventionally, characters in an optical character reading device have been cut out by performing a single projection in each of the horizontal and vertical directions and defining the character recognition area from the resulting shadow of the character, that is, the black block. With this method, however, if noise (part of an adjacent character, dirt, and so on) overlaps the character in the projection, the recognition area is cut out with the noise included. The character therefore cannot be cut out accurately, the area to be handled in post-processing such as character recognition grows, and the processing efficiency of the whole device falls.
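The weakness of the single-pass approach can be illustrated with a small sketch. This is not code from the patent: the function names `black_blocks` and `single_pass_region`, the list-of-lists bitmap, and the choice of taking the widest black block in each direction are all assumptions made for illustration.

```python
def black_blocks(profile):
    """Return (start, end) pairs, end inclusive, of consecutive non-zero runs."""
    blocks, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            blocks.append((start, i - 1))
            start = None
    if start is not None:
        blocks.append((start, len(profile) - 1))
    return blocks


def single_pass_region(image):
    """One projection per direction over the whole area (assumed conventional method)."""
    col_profile = [sum(col) for col in zip(*image)]  # vertical projection
    row_profile = [sum(row) for row in image]        # horizontal projection
    widest = lambda blocks: max(blocks, key=lambda b: b[1] - b[0])
    return widest(black_blocks(col_profile)), widest(black_blocks(row_profile))


# Character at rows 1-2, columns 2-3; a speck of noise at row 4, columns 3-4.
image = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
print(single_pass_region(image))  # ((2, 4), (1, 2))
```

The column range comes out as 2 to 4 although the character only occupies columns 2 to 3: the noise shares column 3 with the character, so its shadow merges into the same black block.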

The present invention therefore alternately repeats vertical and horizontal projections on the cutout area and performs each projection only on the largest block in the projection data of the preceding projection. This excludes noise around the character to be recognized and cuts out a recognition area that follows the outer edge of the character. The object of the invention is to provide a character cutting-out method that eliminates the drawback described above.

The present invention is described concretely below on the basis of an embodiment shown in the drawings.

As shown in FIG. 1, the optical character reading device 1 has a reading section 2, to which a read data memory 3 and a control section 5 are connected. A horizontal projection circuit 7 and a vertical projection circuit 9, which make up the cutout processing section 6, are connected to the memory 3. A horizontal projection register 10 and a vertical projection register 11 are connected to the projection circuits 7 and 9, respectively. A data processing section 12, which is also connected to the projection circuits 7 and 9, is connected to the registers 10 and 11, and the control section 5 is connected to the processing section 12.

With the optical character reading device 1 configured as above, a character 13 read by the reading section 2 on a command from the control section 5 is stored, one character at a time, in a cutout area 3a of the read data memory 3 that is A bits wide in the horizontal direction and B bits wide in the vertical direction, as shown in FIG. 2. (In general, the character entry form used with an optical character reading device is printed with frames. The characters to be recognized are written inside these frames, and the reading section 2 later reads the inside of each frame, so that the characters are stored one at a time in the cutout area 3a. Even when no frames are printed, as when reading a machine-printed form, the paper size and the printing positions are known, so the position of each character at reading time can be predicted to some extent. The reading section 2 reads this predicted position, and the characters are again stored one at a time in the cutout area 3a.)

Next, the vertical projection circuit 9 performs a vertical projection over the full horizontal width of A bits of the area 3a, and the first projection data DAT1 is stored in the vertical projection register 11. Because every image in the area 3a is projected into DAT1, the noises 15, 16 and 17 recorded in the area 3a along with the character 13 (parts of adjacent characters, dirt, and so on) are projected vertically as well. As shown in FIG. 2a, the resulting shadows, the black blocks 19, are formed in two places: at columns x1 to x2, corresponding to the noise 15, and at columns x3 to x4, corresponding to the character 13 and the noises 16 and 17. (One column corresponds to a width of one bit; the same applies below.)

From DAT1, the data processing section 12 judges the widest black block 19, at columns x3 to x4, to be the region onto which the character 13 to be recognized is projected. The horizontal projection circuit 7 then performs a horizontal projection over the full B-bit height, but only on the band region 3b of the area 3a, C bits wide, that corresponds to columns x3 to x4, and the second projection data DAT2 is stored in the horizontal projection register 10. As shown in FIG. 2c, the noise 15 is excluded from DAT2, and two black blocks 19 are formed: one at rows y1 to y2, corresponding to the noise 16, and one at rows y3 to y4, corresponding to the character 13 and the noise 17. (One row corresponds to a width of one bit; the same applies below.)

From DAT2, the processing section 12 judges the widest black block 19, at rows y3 to y4, to be the region onto which the character 13 is projected. A vertical projection is then performed only on the portion covered by both the band region 3c, D bits wide, that corresponds to rows y3 to y4 and the band region 3b, C bits wide, and the third projection data DAT3 is stored in the register 11. The noises 15 and 16 are thereby excluded, and, as shown in FIG. 2b, two black blocks 19 are formed in DAT3: one at columns x3 to x5, corresponding to the character 13, and one at columns x6 to x7, corresponding to the noise 17.

As before, the processing section 12 judges the black block 19 at columns x3 to x5 to be the region onto which the character 13 is projected. A horizontal projection is then performed on the band region 3d, E bits wide, that corresponds to columns x3 to x5 and the band region 3c, D bits wide, and the fourth projection data DAT4 is stored in the register 10, as shown in FIG. 2d. The noises 15, 16 and 17 are all excluded from DAT4, and only the black block 19 at rows y3 to y5, corresponding to the character 13, is formed. The control section 5 can thus determine that the character 13 to be recognized lies in the recognition area 3e, where the band regions 3d and 3f, E and F bits wide in the horizontal and vertical directions respectively, intersect. The recognition area 3e excludes all of the noises 15, 16 and 17 and is cut out accurately along the outer edge of the character 13.
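The four-pass sequence described above can be sketched in software as follows. The patent describes hardware projection circuits and registers, so everything here is an illustrative assumption: the function names, the list-of-lists bitmap, and in particular the stopping rule (stop once a vertical and a horizontal pass in succession no longer shrink the window), which the text does not specify.

```python
def largest_block(profile):
    """Return the widest run of non-zero entries as (start, end), or None if blank."""
    best, start = None, None
    for i, value in enumerate(list(profile) + [0]):  # sentinel closes a trailing run
        if value and start is None:
            start = i
        elif not value and start is not None:
            if best is None or (i - 1 - start) > (best[1] - best[0]):
                best = (start, i - 1)
            start = None
    return best


def segment_character(image, max_passes=16):
    """Alternate vertical/horizontal projections, keeping only the largest block."""
    top, bottom = 0, len(image) - 1
    left, right = 0, len(image[0]) - 1
    vertical, unchanged = True, 0
    for _ in range(max_passes):
        if vertical:  # project the current window onto the column axis
            profile = [sum(image[r][c] for r in range(top, bottom + 1))
                       for c in range(left, right + 1)]
        else:         # project the current window onto the row axis
            profile = [sum(image[r][c] for c in range(left, right + 1))
                       for r in range(top, bottom + 1)]
        block = largest_block(profile)
        if block is None:
            return None  # nothing but white pixels in the window
        if vertical:
            new = (left + block[0], left + block[1])
            changed, (left, right) = new != (left, right), new
        else:
            new = (top + block[0], top + block[1])
            changed, (top, bottom) = new != (top, bottom), new
        unchanged = 0 if changed else unchanged + 1
        if unchanged == 2:  # assumed stopping rule, not stated in the patent
            break
        vertical = not vertical
    return (left, right), (top, bottom)


def demo_image():
    """Hypothetical 10x12 bitmap loosely modelled on FIG. 2."""
    img = [[0] * 12 for _ in range(10)]
    def fill(rows, cols):
        for r in range(rows[0], rows[1] + 1):
            for c in range(cols[0], cols[1] + 1):
                img[r][c] = 1
    fill((3, 6), (3, 6))  # character 13
    fill((4, 5), (0, 1))  # noise 15: separate columns
    fill((0, 1), (5, 8))  # noise 16: shares columns, separate rows
    fill((6, 8), (8, 9))  # noise 17: shares a row with the character
    return img


print(segment_character(demo_image()))  # ((3, 6), (3, 6))
```

On this sample the window shrinks over four passes, in the same order as DAT1 to DAT4: the first vertical pass drops noise 15, the horizontal pass drops noise 16, the second vertical pass separates noise 17 from the character, and the final horizontal pass trims the rows that only noise 17 occupied.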

As described above, according to the present invention, vertical and horizontal projections are alternately repeated on the cutout area 3a in which the character 13 is stored, so that the noises 15, 16 and 17 around the character 13 are excluded and the recognition area 3e is cut out along the outer edge of the character 13. The character 13 can therefore be cut out accurately, the area to be handled in post-processing such as recognition can be set to an appropriate range, and the processing efficiency of the whole device can be greatly improved.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram showing an example of an optical character reading device to which the present invention is applied, and FIG. 2 is an explanatory diagram of cutting out a character from the read data memory of FIG. 1.

1: optical character reading device; 2: reading section; 3: read data memory; 3a: cutout area; 3e: recognition area; 13: character; 15, 16, 17: noise.

Claims (1)

1. A character cutting-out method for an optical character reading device that stores characters read by a reading section one at a time in a cutout area of a read data memory, characterized in that vertical and horizontal projections are alternately repeated on the cutout area, and each projection is performed only on the largest block in the projection data of the preceding projection, whereby noise around the character to be recognized is excluded and a recognition area is cut out along the outer edge of the character.
JP55187927A 1980-12-29 1980-12-29 Segmenting method for character Granted JPS57113182A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP55187927A JPS57113182A (en) 1980-12-29 1980-12-29 Segmenting method for character

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP55187927A JPS57113182A (en) 1980-12-29 1980-12-29 Segmenting method for character

Publications (2)

Publication Number Publication Date
JPS57113182A JPS57113182A (en) 1982-07-14
JPS6330667B2 JPS6330667B2 (en) 1988-06-20

Family

ID=16214629

Family Applications (1)

Application Number Title Priority Date Filing Date
JP55187927A Granted JPS57113182A (en) 1980-12-29 1980-12-29 Segmenting method for character

Country Status (1)

Country Link
JP (1) JPS57113182A (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS615385A (en) * 1984-06-18 1986-01-11 Omron Tateisi Electronics Co Two-dimensional visual recognizer
JPS61100876A (en) * 1984-10-23 1986-05-19 Omron Tateisi Electronics Co Method of segmenting graphic in graphic recognizing device
JPH0799530B2 (en) * 1985-10-02 1995-10-25 住友電気工業株式会社 Optical character reader
JPS62219639A (en) * 1986-03-20 1987-09-26 Sony Corp Manufacture of semiconductor device
JP2597567B2 (en) * 1987-02-27 1997-04-09 株式会社東芝 Image buffer character pattern extraction method
JP2531800B2 (en) * 1989-09-12 1996-09-04 富士電機株式会社 Character cutting method in character reading device

Also Published As

Publication number Publication date
JPS57113182A (en) 1982-07-14

Similar Documents

Publication Publication Date Title
US4566128A (en) Method for data compression for two-value picture image
US4408342A (en) Method for recognizing a machine encoded character
US4466123A (en) Apparatus and method for correcting contour line pattern images
JPS6330667B2 (en)
KR930001098A (en) Mark writing and cancellation method and mark recognition device
US4185271A (en) Character reading system
JP2868392B2 (en) Handwritten symbol recognition device
JP2822792B2 (en) Image noise removal device
JP3067474B2 (en) Image processing device
JPH04311283A (en) Line direction discriminating device
JPH0373916B2 (en)
JPS61260290A (en) Dot pattern expansion processing system
JP2853510B2 (en) Image noise removal device
JP2972011B2 (en) Character recognition device
JPS6334932Y2 (en)
JPS5997187A (en) Space character display system
JP2740539B2 (en) Enlarged reproduction image information creation method and apparatus
CN113392838A (en) Character segmentation method and device and character recognition method and device
JPS594068B2 (en) Character detection cutting device
JPH0535872A (en) Contour tracing system for binary image
JPH0398185A (en) Character segmenting method for character reader
JPH07325878A (en) Recognizing method for character
JPS5838831B2 (en) Connection line extraction processing method
JPS5816666B2 (en) Redundancy reduction coding transmission method
JPH0367030B2 (en)

Legal Events

Date Code Title Description
LAPS Cancellation because of no payment of annual fees