JPS5953983A - Detecting and segmenting method of character - Google Patents

Detecting and segmenting method of character

Info

Publication number
JPS5953983A
JPS5953983A JP57164262A JP16426282A JPS5953983A JP S5953983 A JPS5953983 A JP S5953983A JP 57164262 A JP57164262 A JP 57164262A JP 16426282 A JP16426282 A JP 16426282A JP S5953983 A JPS5953983 A JP S5953983A
Authority
JP
Japan
Prior art keywords
character
point
projection
area
adjacent characters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP57164262A
Other languages
Japanese (ja)
Inventor
Masaki Komiya
小宮 雅紀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toshiba Corp
Original Assignee
Toshiba Corp
Tokyo Shibaura Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Toshiba Corp, Tokyo Shibaura Electric Co Ltd
Priority to JP57164262A
Publication of JPS5953983A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/14: Image acquisition
    • G06V30/148: Segmentation of character regions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Input (AREA)

Abstract

PURPOSE: To segment adjacent characters easily by detecting their contact point within an inspection area that covers the gap between the adjacent characters and its surroundings, and by using a singular point of the stroke that crosses this area as the segmentation point, so that segmentation succeeds whenever the feature of the singular point is clear. CONSTITUTION: A character image pattern P read by a photoelectric conversion unit is stored temporarily in an image memory 1 and is simultaneously supplied to a projection accumulating unit 7. A control unit 6 then compares character position information given in advance with the projection information from the accumulating unit 7 and decides whether segmentation is possible from the projection alone. In principle, if the projection has a separation point in the area 12 between character frames 10 and 11, the characters are segmented at that point. If no separation point exists, the control unit 6 sets an inspection area 13, scans it, and derives a singular point. The singular point of the crossing stroke is then used as the segmentation point, with the help of an image memory 2, a normalizing circuit 3, a character recognition unit 4 and so on, and the adjacent characters are segmented easily.

Description

[Detailed Description of the Invention]

[Technical Field of the Invention]

The present invention relates to a method for detecting and segmenting characters in an optical character reader, and in particular to a segmentation method that can segment characters correctly even when adjacent characters touch each other for some reason.

[Technical background of the invention and its problems]

Some characters entered on a form touch their neighbors, either because of print misalignment or because they were written outside the character entry frame. When characters touch, accurate character recognition is not possible. Conventionally, such cases have been handled by one of the following methods.

(1) Forcibly segment at the center position between the adjacent characters.

(2) Reject the touching characters.

(3) Even if the projection indicates contact, scan two-dimensionally, trace the edges of the character strokes, and check whether segmentation is possible.

(4) Build a histogram of the number of black pixels in the part corresponding to the gap between the adjacent characters, and segment where the accumulated value is small.
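As a rough illustration of the histogram-based conventional method, the sketch below counts black pixels per column inside the gap and cuts at the column with the smallest count. The function name, the binary image representation (lists of 0/1 rows) and the `gap_start`/`gap_end` parameters are assumptions made for this example and do not come from the patent.

```python
def min_histogram_cut(image, gap_start, gap_end):
    """Return the column in [gap_start, gap_end) with the fewest black pixels.

    image is a list of rows; each row is a list of 0 (white) or 1 (black).
    Ties are resolved in favour of the leftmost column.
    """
    counts = [sum(row[x] for row in image) for x in range(gap_start, gap_end)]
    return gap_start + counts.index(min(counts))


if __name__ == "__main__":
    image = [
        [1, 1, 0, 1, 1],
        [1, 1, 1, 1, 1],
        [1, 0, 0, 1, 1],
    ]
    # Columns 1..3 hold 2, 1 and 3 black pixels respectively.
    print(min_histogram_cut(image, 1, 4))
```

Because the cut is chosen from pixel counts alone, a thin stroke that happens to lie inside the gap can attract the cut, which is the kind of misplacement the text goes on to describe.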

Even the comparatively accurate one among these methods may, depending on how the characters touch, choose the wrong segmentation position and cause misrecognition.

[Purpose of the invention]

Accordingly, an object of the present invention is to provide a segmentation method that can segment characters accurately even when the edges or tips of adjacent characters touch and segmentation by ordinary methods is difficult.

[Structure of the invention]

To achieve the above object, the character detection and segmentation method according to the present invention is characterized by the following processing.

(a) Obtain the projection of the character image patterns obtained by photoelectric conversion. From the projection, check whether adjacent character image patterns touch each other.

(b) If the check shows contact, set an inspection area that covers the part corresponding to the gap between the adjacent character image patterns. If there is no contact, the separation point can simply be used as the segmentation point.

(c) Determine the number of character strokes that enter the inspection area from outside, and their entry positions, by scanning the perimeter of the inspection area. The inspection area is normally rectangular.

(d) Trace each character stroke that enters across the sides of the inspection area lying in the arrangement direction of the character image patterns, that is, the left and right sides perpendicular to the character arrangement direction, and determine whether the stroke crosses from one of these sides to the other.

(e) For a stroke that crosses the area, find the singular point on that stroke. The singular point found is used as the segmentation point.

[Effect of the invention]

According to the present invention configured as above, the contact point of adjacent characters is located within an inspection area set to cover the gap between the adjacent characters (or between the character entry frames) and its surroundings, and the singular point of the crossing stroke is used as the segmentation point. Segmentation is therefore reliable as long as the feature of the singular point is clear.

[Embodiments of the invention]

The present invention is described in detail below on the basis of the illustrated embodiments. FIG. 1 shows the main part of an optical character reader (OCR) to which the character detection and segmentation method according to the present invention is applied.

First, the case of handwritten characters as shown in FIG. 1 is described. In FIG. 1, a character image pattern P read by a photoelectric conversion unit (not shown) is sent to the image memory 1 and stored there temporarily. The image memory can hold the character image patterns of one line or one page of the document image.

The character image pattern P is also sent to the projection accumulating unit 7, where the projection of the character image pattern P is accumulated.
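The projection accumulated here can be sketched as a per-column count of black pixels. This is a minimal illustration that assumes a binary image stored as a list of 0/1 rows and a projection onto the character arrangement (horizontal) axis; the function name is hypothetical.

```python
def vertical_projection(image):
    """Count black pixels (value 1) in each column of a binary image.

    image is a list of equally long rows of 0/1 values.
    Returns one count per column; an empty image gives an empty list.
    """
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


if __name__ == "__main__":
    pattern = [
        [1, 0, 1],
        [1, 0, 0],
    ]
    print(vertical_projection(pattern))
```

A column whose count is zero contains no stroke pixels, which is what the later separation-point check relies on.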

Next, the control unit 6 compares the character position information given in advance with the projection information obtained above, and decides whether segmentation is possible from the projection alone. As shown in FIG. 1, even when part of a character extends from one or both of the character frames 10 and 11 into the area 12 between them, if the projection has a separation point within the area 12, that break is in principle used as the segmentation point. Cases where the area 12 contains noise, or where the projection consists of several blocks, are handled by known methods, for example by comparing the area of the noise portion with a predetermined reference value.
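The separation-point check can be sketched as a search for an empty run in the projection inside the area between the character frames. Taking the middle of the first empty run as the cut position is an assumption of this example; the text only says that the break is used as the segmentation point. Noise handling is left out.

```python
def find_separation_point(projection, gap_start, gap_end):
    """Return a cut column inside [gap_start, gap_end), or None.

    projection holds the black-pixel count of each column. The first run
    of zero-count columns in the gap is located and its middle column is
    returned. None means the projection has no break, i.e. the adjacent
    characters appear to touch.
    """
    run_start = None
    for x in range(gap_start, gap_end):
        if projection[x] == 0:
            if run_start is None:
                run_start = x
        elif run_start is not None:
            return (run_start + x - 1) // 2
    if run_start is not None:
        return (run_start + gap_end - 1) // 2
    return None


if __name__ == "__main__":
    print(find_separation_point([3, 2, 0, 0, 0, 4, 1], 1, 6))
    print(find_separation_point([3, 2, 1, 1, 2, 4, 1], 1, 6))
```

A `None` result corresponds to the case in which an inspection area has to be set.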

On the other hand, as shown in FIG. 2, when the projection within the area 12 between the character frames 10 and 11 has no separation point at all (that is, when strokes of the adjacent characters touch each other), the control unit 6 sets an inspection area 13. The inspection area 13 is used to obtain the information needed to decide the segmentation point, that is, to search for the segmentation point.

The width B of the inspection area is determined by the control unit 6 according to the size of the character frames and the writing condition of the person filling in the form (or, for printed characters, the printing accuracy), so it cannot be fixed uniformly. It is, however, at least larger than the distance between the facing sides of the character frames 10 and 11, that is, the width A of the area 12.
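One way to satisfy the condition that B exceeds A is to widen the gap between the facing frame sides by a positive margin on each side. The sketch below does this; the `margin` parameter, the `Rect` type and the clamping to the image width are assumptions of the example, since the text leaves the choice of B to the control unit.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    left: int    # inclusive column
    top: int     # inclusive row
    right: int   # inclusive column
    bottom: int  # inclusive row


def inspection_area(side_a_x, side_b_x, top, bottom, margin, image_width):
    """Build an inspection rectangle around the gap between two frames.

    side_a_x and side_b_x are the columns of the facing frame sides
    (side_a_x < side_b_x), so the gap width A is side_b_x - side_a_x.
    The rectangle is widened by margin columns on each side, which gives
    a width B greater than A unless clamping at the image border removes
    the widening.
    """
    if margin <= 0:
        raise ValueError("margin must be positive so that B exceeds A")
    left = max(0, side_a_x - margin)
    right = min(image_width - 1, side_b_x + margin)
    return Rect(left, top, right, bottom)


if __name__ == "__main__":
    area = inspection_area(10, 14, 0, 20, margin=3, image_width=100)
    print(area, "B =", area.right - area.left, "A =", 14 - 10)
```

In practice the margin would depend on frame size and writing or printing quality, as described above.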

Once the inspection area 13 has been set in this way, the control unit 6 scans the four sides of the inspection area 13, and also the facing sides of the character frames 10 and 11, to determine the positions at which character strokes enter the areas 12 and 13 and the number of such strokes. In the example of FIG. 2, the entering strokes are the four marked portions.
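The perimeter scan can be sketched by reading the pixels along each side of the rectangle and reporting every run of black pixels as one entering stroke. Scanning the four sides independently is a simplification assumed here: a stroke passing exactly through a corner is reported on both adjoining sides. All names are hypothetical.

```python
def black_runs(values):
    """Return (start, end) index pairs of consecutive 1s in values."""
    result, start = [], None
    for i, v in enumerate(values):
        if v and start is None:
            start = i
        elif not v and start is not None:
            result.append((start, i - 1))
            start = None
    if start is not None:
        result.append((start, len(values) - 1))
    return result


def entering_strokes(image, left, top, right, bottom):
    """List strokes crossing the rectangle perimeter (bounds inclusive).

    Returns tuples (side, start, end): start and end are column numbers
    for the top and bottom sides and row numbers for the left and right
    sides.
    """
    sides = {
        "top": ([image[top][x] for x in range(left, right + 1)], left),
        "bottom": ([image[bottom][x] for x in range(left, right + 1)], left),
        "left": ([image[y][left] for y in range(top, bottom + 1)], top),
        "right": ([image[y][right] for y in range(top, bottom + 1)], top),
    }
    found = []
    for side, (values, offset) in sides.items():
        for start, end in black_runs(values):
            found.append((side, offset + start, offset + end))
    return found


if __name__ == "__main__":
    # A single horizontal stroke on row 2 spanning the whole image.
    image = [[0] * 7 for _ in range(5)]
    image[2] = [1] * 7
    print(entering_strokes(image, left=1, top=0, right=5, bottom=4))
```

The number of entering strokes is the length of the returned list, and the side label shows whether a stroke enters through the left or right side, which matters for the crossing test.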

Once the entering strokes have been identified, each is scanned or traced to find the singular points on it. A singular point is a point such as a bend or a point of inflection; in the example of FIG. 2, the marked point is the singular point. Singular points can be found, for example, by either of the following methods.

(1) A general method for characterizing stroke edges: while tracing the stroke edge, assign the direction of the edge to one of eight direction vectors at fixed small intervals and record the changes; afterwards, judge from the record whether a change occurred that is large enough to be distinguished from noise.

(2) Scan the whole of the area 12, obtain for each scanning line the displacement of the point at which the black/white level changes, and differentiate it from line to line; a point at which the differential value (that is, the displacement) turns positive or negative is judged to be a singular point.

Other generally known methods may also be used.
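The direction-vector method can be sketched as follows, assuming the stroke edge has already been traced into an ordered list of (x, y) points and that the direction set is the usual eight compass directions. The sampling interval `step` and the threshold `min_turn` (in units of 45 degrees) are hypothetical parameters standing in for "fixed small intervals" and "distinguishable from noise".

```python
import math


def direction_code(dx, dy):
    """Quantise a displacement to one of eight directions (0..7).

    Code 0 points along +x and codes increase in 45 degree steps.
    """
    angle = math.atan2(dy, dx)
    return round(angle / (math.pi / 4)) % 8


def singular_points(path, step=3, min_turn=2):
    """Return sampled path points where the direction changes sharply.

    path is an ordered list of (x, y) points along a traced stroke edge.
    Every step-th point is sampled, each segment between samples gets a
    direction code, and a sample is reported when the codes of the
    segments before and after it differ by at least min_turn steps.
    """
    samples = path[::step]
    codes = [
        direction_code(b[0] - a[0], b[1] - a[1])
        for a, b in zip(samples, samples[1:])
    ]
    found = []
    for i in range(1, len(codes)):
        diff = abs(codes[i] - codes[i - 1])
        turn = min(diff, 8 - diff)  # the direction codes wrap around
        if turn >= min_turn:
            found.append(samples[i])
    return found


if __name__ == "__main__":
    # An L-shaped edge: 7 points to the right, then 6 points downwards.
    path = [(x, 0) for x in range(7)] + [(6, y) for y in range(1, 7)]
    print(singular_points(path))
```

Raising `min_turn` makes the detector ignore gentle curvature, at the cost of missing shallow bends.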

Once the singular point has been found, a control signal is sent to the image memory 1 so that the character pattern is cut out with the singular point as the segmentation point. On the basis of this control signal, the single character cut out at the marked point in FIG. 2 (for example, the "ホ" in FIG. 2) is stored temporarily in the one-character storage memory 2. Subsequent processing follows well-known methods.
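The cut itself can be sketched as splitting the stored pattern into two images at the segmentation point. Using a straight vertical cut through the column of the singular point is an assumption of this example; the text does not state the shape of the cut line.

```python
def split_at(image, cut_x):
    """Split a binary image into left and right parts at column cut_x.

    Columns 0..cut_x-1 go to the left part and the rest to the right
    part. The input image is not modified.
    """
    left = [row[:cut_x] for row in image]
    right = [row[cut_x:] for row in image]
    return left, right


if __name__ == "__main__":
    image = [
        [1, 1, 0, 1],
        [0, 1, 1, 1],
    ]
    first_char, remainder = split_at(image, 2)
    print(first_char)
    print(remainder)
```

The left part plays the role of the single character held in the one-character memory, while the remainder stays available for the next segmentation.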

That is, the character is normalized by the normalizing circuit 3 and recognized by the character recognition unit 4. Reference numeral 5 denotes a memory in which dictionary patterns are stored.

Next, the case where the characters on the form are printed characters, such as type, is described. FIG. 3 shows an example in which the serifs of characters touch. A serif is a decorative part attached sideways, mainly at the upper and lower ends of Latin letters and similar characters.

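In such a case, the conventional histogram-based method (4) is likely to cut mistakenly at the breaks between serifs in the marked portions. In the present invention, therefore, points where character strokes cross at right angles are detected; as a rule, when a vertically long line or an inclined line is found, that line and the part beside it are judged to be part of the character stroke, and segmentation is performed at a point that excludes them. As in FIG. 4, however, a singular point is sometimes difficult to detect. In such cases the characters are either rejected or forcibly separated at the virtual center position, according to information given to the control unit 6 in advance.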
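The fallback policy can be sketched as a small decision function. The `on_failure` flag stands in for the information given to the control unit in advance, and the virtual center is assumed here to be the midpoint between the facing sides of the two character frames; both are assumptions of this example.

```python
def choose_cut(singular_xs, side_a_x, side_b_x, on_failure="reject"):
    """Decide how to segment two touching characters.

    singular_xs: columns of detected singular points (may be empty).
    side_a_x, side_b_x: columns of the facing sides of the two frames.
    on_failure: "reject" or "force", used when no singular point exists.
    Returns ("cut", column) or ("reject", None).
    """
    if singular_xs:
        return ("cut", singular_xs[0])
    if on_failure == "force":
        return ("cut", (side_a_x + side_b_x) // 2)
    return ("reject", None)


if __name__ == "__main__":
    print(choose_cut([12], 10, 16))
    print(choose_cut([], 10, 16))
    print(choose_cut([], 10, 16, on_failure="force"))
```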
If you draw out a line at the point where two trees intersect at right angles, and make a long line or a slanted line vertically as a grinding process, then draw a letter line on the side of it in the bag of However, as shown in Figure 1, it may be difficult to detect a singular point.In such a case, the control section Either the rejection is performed according to the information provided in advance, or forced separation is performed at the virtual center position.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the main part of an OCR to which the character segmentation method according to the present invention is applied. FIG. 2 is an explanatory diagram showing an example in which the target characters are handwritten. FIG. 3 is an explanatory diagram showing an example in which the target characters are printed. FIG. 4 is an explanatory diagram showing a printed example in which singular point detection is difficult.

P: character image pattern; 1: image memory; 2: one-character storage memory; 3: normalizing circuit; 4: character recognition unit; 5: dictionary patterns; 6: control unit; 7: projection accumulating unit; 10, 11: character frames; 12: area; 13: inspection area.

Claims (1)

[Claims] A character detection and segmentation method for cutting out individual character images by detecting the boundaries between mutually adjacent character image patterns among character image patterns obtained by photoelectric conversion, the method being characterized by: obtaining the projection of the character image patterns and checking whether adjacent character image patterns touch each other; if they touch, setting an inspection area that covers the part corresponding to the gap between the adjacent character image patterns; determining the number of character strokes that enter the inspection area from outside, and their entry positions, by scanning the perimeter of the inspection area; tracing each character stroke that enters the inspection area and determining whether that stroke crosses the inspection area; if it does, finding a singular point on the crossing stroke; and using this singular point as the segmentation point.
JP57164262A 1982-09-21 1982-09-21 Detecting and segmenting method of character Pending JPS5953983A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP57164262A JPS5953983A (en) 1982-09-21 1982-09-21 Detecting and segmenting method of character

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP57164262A JPS5953983A (en) 1982-09-21 1982-09-21 Detecting and segmenting method of character

Publications (1)

Publication Number Publication Date
JPS5953983A true JPS5953983A (en) 1984-03-28

Family

ID=15789742

Family Applications (1)

Application Number Title Priority Date Filing Date
JP57164262A Pending JPS5953983A (en) 1982-09-21 1982-09-21 Detecting and segmenting method of character

Country Status (1)

Country Link
JP (1) JPS5953983A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007058304A (en) * 2005-08-22 2007-03-08 Toshiba Corp Character recognition device and character recognition method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS522339A (en) * 1975-06-24 1977-01-10 Nec Corp Unnecessary portion eliminating equipment in pattern identifying equipment
JPS5561880A (en) * 1978-10-31 1980-05-09 Fujitsu Ltd Character cut out system
JPS5617574A (en) * 1979-07-23 1981-02-19 Nec Corp Noise picture eliminating device
JPS5668871A (en) * 1979-11-09 1981-06-09 Toshiba Corp Character detecting and cutting-out device

Similar Documents

Publication Publication Date Title
JP2553608B2 (en) Optical character reader
JPS6077279A (en) Initiation of character image
JPH02306386A (en) Character recognizing device
JPH0743755B2 (en) Character recognition device
JPS5953983A (en) Detecting and segmenting method of character
JP2797848B2 (en) Optical character reader
JPH0548510B2 (en)
JPH05189546A (en) Device for discriminating authenticity of fingerprint featured point
JP3957471B2 (en) Separating string unit
JPH07230525A (en) Method for recognizing ruled line and method for processing table
JP4242962B2 (en) Character extractor
JP2877380B2 (en) Optical character reader
JP3090036B2 (en) Number detection device
JP2590099B2 (en) Character reading method
JP2925270B2 (en) Character reader
JP2832035B2 (en) Character recognition device
JP2963807B2 (en) Postal code frame detector
JPH0467674B2 (en)
JP2659182B2 (en) Character recognition device
JPS58211280A (en) Character reader
JP3391223B2 (en) Character recognition device
CN118781619A (en) Intelligent abnormality detection method and device for answer sheet image based on visual identification
JP2683290B2 (en) Ruled line determination method and character recognition device
CN118781603A (en) Intelligent answer sheet image defect position determining method and device based on visual identification
JP2001038303A (en) Address reader