JPS58165178A - Character reader - Google Patents

Character reader

Info

Publication number
JPS58165178A
Authority
JP
Japan
Prior art keywords
character
character pattern
circumscribed
line
length
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP57046749A
Other languages
Japanese (ja)
Other versions
JPH031712B2 (en)
Inventor
Kunio Sakai
坂井 邦夫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toshiba Corp
Original Assignee
Toshiba Corp
Tokyo Shibaura Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Toshiba Corp, Tokyo Shibaura Electric Co Ltd
Priority to JP57046749A
Publication of JPS58165178A
Publication of JPH031712B2

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/18Extraction of features or characteristics of the image
    • G06V30/182Extraction of features or characteristics of the image by coding the contour of the pattern
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Discrimination (AREA)

Abstract

PURPOSE: To extract the average amount of unevenness of a character pattern as a feature, by obtaining, for each circumscribed side, the length of the boundary line between the character line part and the background part in the periphery of the character pattern, and deriving the average unevenness from the value of this boundary line length relative to the length of the circumscribed frame.

CONSTITUTION: The circumscribed frame surrounding the character pattern is obtained by detecting a left circumscribed side L touching the leftmost end, a right circumscribed side R touching the rightmost end, an upper circumscribed side U touching the uppermost end, and a lower circumscribed side D touching the lowermost end. Using the circumscribed sides U, D, L, and R as reference positions, the character pattern is scanned from each side toward its center, and the length of each scanning line up to the character line is obtained. The sum of the differences between these scanning line lengths is calculated for each of the sides U, D, L, and R, and the result is taken as the length of the boundary line between the background part and the character line part of the character pattern. In this way, the features needed for broad classification of an original handwritten character pattern are extracted stably and efficiently.

Description

[Detailed Description of the Invention]

[Technical Field of the Invention]

The present invention relates to a highly practical character reader that takes a large number of characters, including kanji, as its reading targets and can stably and efficiently extract the features used for broad classification.

[Technical Background of the Invention]

Characters are commonly read by extracting features from an input character pattern, matching them against the features of dictionary patterns registered in advance in a dictionary, and recognizing the input character pattern according to the matching result.

However, when this kind of character recognition targets a large number of characters including kanji, the amount of feature matching becomes enormous and processing efficiency deteriorates markedly. Conventionally, therefore, the characters to be recognized are first broadly classified by several rough features before the recognition process proper, which reduces the number of characters to be matched at the identification stage and so improves processing efficiency and speed. According to the text as printed, the features used for this broad classification and those used for identifying candidate characters need to be the same, and these features must be sufficiently stable against the deformation of character patterns that arises in handwriting and against various kinds of noise.

The features conventionally used for this broad classification of character patterns include those that focus on the complexity of the character lines, those that focus on the shape of the character periphery, and those that focus on the rough shape of the character as a whole. These are described in detail in, for example, the following reference.

Sakai and Watanabe, "Current Status of Printed Kanji Recognition," Joho Shori (Information Processing), Vol. 22, No. 4, pp. [start page illegible]-279 (April 1981)

FIG. 1 shows one example. It focuses on the shape around the original character pattern: character lines lying within rectangular scanning regions U, D, L, and R placed at the top, bottom, left, and right of the periphery are detected, and their amount is quantized into three levels, "0", "1", and "2", which serve as the feature data.

The method shown in FIG. 2 focuses on the circumscribed sides u, d, l, and r touching the original character pattern, obtains the area of the character background adjoining each of these sides, and uses these areas as a multidimensional feature vector. The former (FIG. 1) looks at the character line part whereas the latter (FIG. 2) looks at the background area, but both take an area-based quantity around the character pattern as the feature.

[Problems of the Background Art]

The feature extraction described above is very effective when the patterns are standardized, as with printed type. It is extremely unstable, however, for patterns such as handwritten characters that retain their identity as characters yet are heavily deformed. Moreover, even with printed characters, the feature extraction becomes very unstable when the character pattern is chipped or blurred. In other words, the conventional methods cannot always extract features stably in the face of character pattern deformation and various kinds of noise.

[Object of the Invention]

The present invention has been made in view of these circumstances. Its object is to provide a highly practical character reader that takes a large number of characters, including handwritten kanji, as its reading targets and can stably and efficiently extract the features needed for broad classification of an input original character pattern.

[Summary of the Invention]

The present invention relates to a character reader that, given that much of the classification information (features) needed for broad classification of characters lies in the periphery of the character pattern, obtains the length of the boundary line between the character line part and the background part for each circumscribed side of the input original character pattern, derives the average amount of unevenness of the character pattern from the value of these boundary line lengths relative to the length of the circumscribed frame, and extracts it as a feature for broad classification.

[Effects of the Invention]

The present invention introduces a novel concept: the length of the boundary line between the character line part and the background part corresponding to each circumscribed side of the character pattern enclosed by the circumscribed sides touching it. This expresses the degree of unevenness of the character periphery, or in other words the complexity of its shape. Because the feature is the size of these boundary line lengths relative to the circumscribed frame, the feature information is obtained stably against various kinds of noise such as changes in the size of the character pattern and positional shifts. The broad classification of input patterns can therefore be carried out stably and efficiently, which markedly improves character recognition processing. Furthermore, whether the target is a complex character pattern such as kanji, handwritten characters, or printed characters, the above features are easily obtained by the simple method described below, giving a notable and highly practical effect in character recognition processing.

[Embodiment of the Invention]

An embodiment of the present invention is described below with reference to the drawings.

FIG. 3 illustrates the feature extraction process applied to an input character pattern in this apparatus, using a handwritten pattern of the character "天" as the example. Feature extraction begins by detecting the circumscribed frame that surrounds the given character pattern. The frame is obtained by detecting the left circumscribed side L touching the leftmost end of the character pattern, the right circumscribed side R touching its rightmost end, the upper circumscribed side U touching its uppermost end, and the lower circumscribed side D touching its lowermost end. The sides U, D, L, and R are easily found by scanning the character pattern and computing the extreme values of the position coordinates of the character lines.
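The frame detection described above can be sketched in a few lines. This is an illustrative sketch only: the function name `circumscribed_frame`, the list-of-rows image representation (1 for a stroke pixel, 0 for background), and the sample pattern are assumptions, not part of the patent text.

```python
def circumscribed_frame(pattern):
    """Return (left, right, top, bottom) indices of the outermost stroke pixels.

    `pattern` is a list of equal-length rows; 1 marks a stroke pixel and
    0 marks background (an assumed representation).
    """
    rows = [y for y, row in enumerate(pattern) if any(row)]
    cols = [x for x in range(len(pattern[0])) if any(row[x] for row in pattern)]
    if not rows:
        raise ValueError("pattern contains no stroke pixels")
    # The extreme coordinates of the strokes give the sides L, R, U, D.
    return min(cols), max(cols), min(rows), max(rows)


pattern = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
frame = circumscribed_frame(pattern)  # (1, 4, 1, 3)
```

Cropping the pattern to this frame gives the image that the later scanning steps operate on.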

Next, taking each of the obtained circumscribed sides U, D, L, and R as a reference position (reference line), the pattern is scanned from each side toward the center of the character pattern, and the length of each scanning line up to the point where it reaches a character line is obtained. For signal processing it is preferable to limit the maximum scan length to a fixed proportion of the character pattern size. The sum of the differences between these scanning line lengths is then calculated for each of the sides U, D, L, and R, and this value is taken as the length of the boundary line between the background part and the character line part of the character pattern. For example, the boundary line length lL for the circumscribed side L is obtained, as shown in FIG. 3, by setting the maximum length to hH (h < 1), scanning from L as the reference position toward the center of the character pattern, and measuring the individual scan lengths.
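As a rough illustration of the scan from the left circumscribed side, the sketch below measures, for every row, how far a scan travels before meeting a stroke, capped at hH. The function name, the default h = 0.5, and the sample pattern are assumptions; the pattern is assumed to be already cropped to its circumscribed frame.

```python
def scan_depths_from_left(pattern, h=0.5):
    """Distance from the left side L to the first stroke pixel, per row.

    `pattern` is a cropped binary image (1 = stroke, 0 = background).
    A scan that meets no stroke stops at the cap int(h * H), where H is
    the horizontal width of the frame and h < 1.
    """
    width = len(pattern[0])
    cap = int(h * width)
    depths = []
    for row in pattern:
        depth = 0
        while depth < cap and row[depth] == 0:
            depth += 1
        depths.append(depth)
    return depths


cropped = [
    [0, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]
depths = scan_depths_from_left(cropped)  # [2, 0, 1, 2]
```

The other three sides can be handled the same way by reversing rows or by scanning columns.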

If a scanning line does not reach a character line, the scan is terminated when it reaches the length hH. For the scanning lines obtained in this way, the differences in length (absolute values) between adjacent scanning lines are summed as shown in FIG. 4, and the vertical width V of the character is added. The contour length lL adjoining the circumscribed side L obtained in this way (the length of the thick line in FIG. 4) is a wide-area feature that is stabilized against the stroke width, slant, and positional shift of the character pattern. The contour lengths of the background parts adjoining the other circumscribed sides R, U, and D are obtained by the same processing.
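The length computation for one side reduces to a short sum. The sketch below assumes the scan depths have already been measured, one per scanning line; the function name `boundary_length` and the sample values are illustrative.

```python
def boundary_length(depths):
    """Contour length for one circumscribed side.

    Sums the absolute differences between adjacent scan depths and adds
    the number of scanning lines, which for side L equals the vertical
    width V of the frame.
    """
    jumps = sum(abs(a - b) for a, b in zip(depths, depths[1:]))
    return jumps + len(depths)


length_l = boundary_length([2, 0, 1, 2])  # 2 + 1 + 1 jumps, plus V = 4, gives 8
flat = boundary_length([0, 0, 0])         # a straight edge gives just V = 3
```

A perfectly straight edge contributes only the side length, so any excess over V reflects unevenness of the outline seen from that side.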

These values are then normalized with respect to the size of the circumscribed frame of the character pattern, that is, its vertical and horizontal widths, to give the normalized boundary line lengths (the expression itself is missing from this text), where H and V denote the horizontal and vertical widths of the circumscribed frame. This yields the information describing the periphery of the input character pattern as the set of boundary line lengths (CL, CR, CU, CD). This information corresponds to the average degree of unevenness of the outline as seen from the circumscribed sides U, D, L, and R, or in other words the complexity of the character periphery, and therefore reflects the characteristics of the character pattern well enough for broad classification.
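Putting the steps together gives a four-component feature. The normalizing expression did not survive in this copy, so the sketch below simply assumes that each boundary length is divided by the length of the side it was scanned along (V for L and R, H for U and D). That choice, the function names, and the default h = 0.5 are assumptions rather than the patent's formula.

```python
def side_profiles(pattern, h=0.5):
    """Scan depths from the L, R, U and D sides of a cropped binary pattern."""
    V, H = len(pattern), len(pattern[0])
    columns = [list(col) for col in zip(*pattern)]

    def first_stroke(line, cap):
        depth = 0
        while depth < cap and line[depth] == 0:
            depth += 1
        return depth

    cap_h, cap_v = int(h * H), int(h * V)
    return {
        "L": [first_stroke(row, cap_h) for row in pattern],
        "R": [first_stroke(row[::-1], cap_h) for row in pattern],
        "U": [first_stroke(col, cap_v) for col in columns],
        "D": [first_stroke(col[::-1], cap_v) for col in columns],
    }


def unevenness_features(pattern, h=0.5):
    """Return {"L": CL, "R": CR, "U": CU, "D": CD} for a cropped pattern."""
    features = {}
    for side, depths in side_profiles(pattern, h).items():
        jumps = sum(abs(a - b) for a, b in zip(depths, depths[1:]))
        length = jumps + len(depths)
        # Assumed normalization: divide by the length of the scanned side.
        features[side] = length / len(depths)
    return features


cropped = [
    [0, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]
features = unevenness_features(cropped)
solid = unevenness_features([[1, 1, 1]] * 3)  # every component is 1.0
```

Under this assumed normalization a solid rectangle scores 1.0 on every side, and larger values indicate a more uneven outline.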

By using the peripheral features (CL, CR, CU, CD) of the character pattern obtained in this way, the character pattern can be broadly classified effectively, stably, and reliably. For example, when variously shaped handwritten versions of the character "古" are input, as shown in FIG. 5(a), (b), and (c), the boundary line lengths l1, l2, and l3 of the upper side of each character pattern obtained by the above processing all coincide with counting the length l of the pattern shaped as in FIG. 5(d) (given as "= Ha + hv," in the text as printed). Therefore, if this feature information is normalized by the character size, the feature values are identical as long as the ratio of H to V is the same.

In other words, (CL, CR, CU, CD) can be regarded as features that are stabilized against character deformation.
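The remark about size normalization can be checked on a toy pattern. The sketch below computes only the left-side feature, assuming normalization by the vertical width V (the patent's own expression is not legible here), and compares a small pattern with a copy enlarged by an integer factor. The helper names `left_feature` and `enlarge` are hypothetical.

```python
def left_feature(pattern, h=0.5):
    """Left-side unevenness of a cropped binary pattern, normalized by V (assumed)."""
    V, H = len(pattern), len(pattern[0])
    cap = int(h * H)
    depths = []
    for row in pattern:
        d = 0
        while d < cap and row[d] == 0:
            d += 1
        depths.append(d)
    length = sum(abs(a - b) for a, b in zip(depths, depths[1:])) + V
    return length / V


def enlarge(pattern, k):
    """Scale a binary pattern by an integer factor k in both directions."""
    return [[px for px in row for _ in range(k)] for row in pattern for _ in range(k)]


small = [
    [0, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]
large = enlarge(small, 3)
same = left_feature(small) == left_feature(large)  # True: both are 2.0
```

Scan depths and the number of scanning lines both grow by the same factor, so the ratio is unchanged when the character keeps its proportions.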

FIG. 6 is a schematic configuration diagram of an apparatus according to one embodiment of the present invention, which performs the processing described above to extract features for broad classification of character patterns.

A character pattern 1 to be read is photoelectrically converted by a scanning photoelectric conversion device 2, consisting of, for example, a television camera, which scans the character surface to input it. The image signal of character pattern 1 input through the photoelectric conversion device 2 is fed to a binary quantization device 3, where it is discriminated at a level determined, for example, with reference to the background density, and quantized into two values.

The quantized character pattern signal is stored in a pattern storage device 4, such as a one-frame memory, as binary pixel signals corresponding to the scanning positions. The character pattern pixel signals stored in the storage device 4 are read out by scanning in predetermined directions and passed to the feature extraction process described above.
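The binary quantization step can be illustrated with a simple threshold tied to the background level. The text only says the discrimination level is set with reference to the background density, so the specifics below are assumptions: dark strokes on a light background, the background taken as the most frequent grey level, and a fixed `offset` below it used as the threshold.

```python
from collections import Counter


def binarize(gray, offset=50):
    """Quantize a grey-level image to 1 (stroke) and 0 (background).

    Assumes dark strokes on a light background. The background level is
    estimated as the most frequent grey value, and pixels darker than
    background - offset are treated as strokes.
    """
    levels = Counter(px for row in gray for px in row)
    background = levels.most_common(1)[0][0]
    threshold = background - offset
    return [[1 if px < threshold else 0 for px in row] for row in gray]


gray = [
    [250, 250, 40],
    [250, 60, 250],
    [250, 250, 250],
]
binary = binarize(gray)  # [[0, 0, 1], [0, 1, 0], [0, 0, 0]]
```

The resulting binary image plays the role of the contents of the pattern storage device 4.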

The circumscribed frame detection device 5 searches the position coordinates of the character lines of the character pattern stored in the storage device 4, finds the leftmost coordinate XL and the rightmost coordinate XR, and sets the left circumscribed side as L = XL and the right circumscribed side as R = XR.

At the same time, it finds the uppermost coordinate YU and the lowermost coordinate YD of the character lines of the character pattern, which give the upper circumscribed side U and the lower circumscribed side D. The circumscribed frame detection device 5 supplies the information on these sides L, R, U, and D as control information to scanning circuits 6L, 6R, 6U, and 6D, respectively. Each of the scanning circuits 6L, 6R, 6U, and 6D determines the reference line for starting its scan from the side information it receives, and scans the character pattern signal stored in the storage device 4 from that reference line toward the center of the character pattern (left to right, right to left, top to bottom, and bottom to top, respectively), obtaining the scanning line lengths for its circumscribed side.

The scanning line information obtained in this way is input to boundary line calculation units 7L, 7R, 7U, and 7D, which compute the sums of the scanning line length differences lL, lR, lU, and lD as described above and normalize them to obtain the relative quantities CL, CR, CU, and CD.

These values are combined and given to a feature comparison device 8 as the feature information (CL, CR, CU, CD) of the periphery of the input character pattern.

A feature dictionary 9 holds, registered in advance, feature information of the kind described above for each character to be recognized. The feature comparison device 8 successively calculates the similarity between these dictionary features and the features obtained for the input character pattern. When the similarity is equal to or greater than a predetermined tolerance, the category of the dictionary entry that produced it is output as a broad classification result.
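The comparison stage can be sketched as a lookup that keeps every dictionary category whose similarity clears the tolerance. The text does not say how similarity is computed, so the measure used here (1 / (1 + Euclidean distance)), the tolerance of 0.8, and all dictionary values are made-up placeholders for illustration, not data from the patent.

```python
import math


def similarity(a, b):
    """Assumed similarity: 1.0 for identical vectors, falling toward 0 with distance."""
    return 1.0 / (1.0 + math.dist(a, b))


def broad_classify(features, dictionary, tolerance=0.8):
    """Return the categories whose reference features are similar enough."""
    return [
        category
        for category, reference in dictionary.items()
        if similarity(features, reference) >= tolerance
    ]


# Hypothetical (CL, CR, CU, CD) reference values, invented for this example.
dictionary = {
    "天": (1.6, 1.6, 1.2, 1.9),
    "古": (1.3, 1.3, 1.8, 1.0),
    "口": (1.0, 1.0, 1.0, 1.0),
}
candidates = broad_classify((1.55, 1.6, 1.25, 1.85), dictionary)  # ["天"]
```

Because several categories may pass the tolerance, the result is a candidate list that a later, finer recognition stage would narrow down.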

As described above, an apparatus that performs the feature extraction of the present invention and the broad classification based on it can be realized very simply. If the broad classification result obtained in this way and the image signal of the input character pattern are passed to the character recognition section at the next stage, character recognition can be carried out simply and efficiently. Moreover, because the features are stable against deformation of character patterns, as in handwriting, and against various kinds of noise, the efficiency of recognition processing can be improved markedly.

The invention therefore has the great advantage of making it possible to build a highly practical character reading and recognition system that handles many characters, including kanji.

The present invention is not limited to the embodiment described above. For example, in the embodiment the boundary line length is obtained as the sum of the differences in length between adjacent scanning lines, but a similar effect can be obtained by directly tracing the outline of the character part to find the boundary line length. This increases the amount of data processing but yields a more detailed and accurate boundary line length between the character lines and the background. Broad classification may also be performed in combination with other feature information. In short, the present invention can be implemented with various modifications without departing from its gist.

[Brief Description of the Drawings]

FIGS. 1 and 2 are diagrams for explaining the concept of conventional character pattern feature extraction, FIG. 3 is a diagram for explaining the concept of character pattern feature extraction according to the present invention, FIG. 4 is a diagram of the character pattern according to the present invention [...], and FIG. 6 is a schematic configuration diagram of the main parts of an apparatus according to one embodiment of the present invention.

1: character pattern, 2: scanning photoelectric conversion device, 3: binary quantization device, 4: pattern storage device, 5: circumscribed frame detection device, 6L, 6R, 6U, 6D: scanning circuits, 7L, 7R, 7U, 7D: boundary line calculation units, 8: feature comparison device, 9: feature dictionary.

Applicant's agent: Patent attorney Takehiko Suzue

Claims (2)

[Claims]

(1) A character reader comprising: means for photoelectrically converting and inputting an original character pattern; means for binary-quantizing and storing the input original character pattern; means for detecting a circumscribed frame made up of the circumscribed sides in the four directions (top, bottom, left, and right) of the stored original character pattern; means for detecting the contour line of the character background part in the direction from each circumscribed side toward the center of the original character; means for detecting the average amount of unevenness around the original character pattern from the length, relative to the circumscribed frame, of the contour line of the character background part corresponding to each circumscribed side; and means for performing broad classification of the original character pattern from the detected average amount of unevenness.

(2) The character reader according to claim 1, wherein the means for detecting the contour line length of the character background part scans from each circumscribed side toward the center of the original character pattern and accumulates the differences in scanning distance from the circumscribed side to the character line.
JP57046749A 1982-03-24 1982-03-24 Character reader Granted JPS58165178A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP57046749A JPS58165178A (en) 1982-03-24 1982-03-24 Character reader

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP57046749A JPS58165178A (en) 1982-03-24 1982-03-24 Character reader

Publications (2)

Publication Number Publication Date
JPS58165178A true JPS58165178A (en) 1983-09-30
JPH031712B2 JPH031712B2 (en) 1991-01-11

Family

ID=12755974

Family Applications (1)

Application Number Title Priority Date Filing Date
JP57046749A Granted JPS58165178A (en) 1982-03-24 1982-03-24 Character reader

Country Status (1)

Country Link
JP (1) JPS58165178A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS62281096A (en) * 1986-05-30 1987-12-05 Canon Inc Character recognition device
JPS6436387A (en) * 1987-07-31 1989-02-07 Toyota Central Res & Dev Character recognition device
JPS6436389A (en) * 1987-07-31 1989-02-07 Toyota Central Res & Dev Standard character pattern forming device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS62281096A (en) * 1986-05-30 1987-12-05 Canon Inc Character recognition device
JP2578767B2 (en) * 1986-05-30 1997-02-05 キヤノン株式会社 Image processing method
JPS6436387A (en) * 1987-07-31 1989-02-07 Toyota Central Res & Dev Character recognition device
JPS6436389A (en) * 1987-07-31 1989-02-07 Toyota Central Res & Dev Standard character pattern forming device

Also Published As

Publication number Publication date
JPH031712B2 (en) 1991-01-11

Similar Documents

Publication Publication Date Title
US5077805A (en) Hybrid feature-based and template matching optical character recognition system
US9158986B2 (en) Character segmentation device and character segmentation method
KR19980023917A (en) Pattern recognition apparatus and method
JP2002133426A (en) Ruled line extracting device for extracting ruled line from multiple image
Shafait et al. Layout analysis of Urdu document images
Lue et al. A novel character segmentation method for text images captured by cameras
JPS58165178A (en) Character reader
Ullmann Picture analysis in character recognition
JP2797848B2 (en) Optical character reader
Huang et al. Scene character detection and recognition with cooperative multiple-hypothesis framework
Ting et al. A syntactic business form classifier
JP4159071B2 (en) Image processing method, image processing apparatus, and computer-readable recording medium storing program for realizing the processing method
Yuan et al. Page segmentation and text extraction from gray-scale images in microfilm format
KR910000786B1 (en) Pattern recognition system
JPS6120036B2 (en)
CN112183538B (en) Manchu recognition method and system
JPH06139338A (en) Fingerprint pattern classifying device
JPH0324709B2 (en)
JP3163698B2 (en) Character recognition method
JP2580976B2 (en) Character extraction device
JP2715930B2 (en) Line detection method
JP2918363B2 (en) Character classification method and character recognition device
JPH03126188A (en) Character recognizing device
JP2832035B2 (en) Character recognition device
Sadri et al. Statistical characteristics of slant angles in handwritten numeral strings and effects of slant correction on segmentation