JPS5852783A - Feature extraction system - Google Patents

Feature extraction system

Info

Publication number
JPS5852783A
Authority
JP
Japan
Prior art keywords
character
character pattern
line segment
extracted
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP56151051A
Other languages
Japanese (ja)
Other versions
JPH0139156B2 (en)
Inventor
Michiaki Nakanishi
道明 中西
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd
Priority to JP56151051A
Publication of JPS5852783A
Publication of JPH0139156B2
Granted

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/18 Extraction of features or characteristics of the image
    • G06V30/182 Extraction of features or characteristics of the image by coding the contour of the pattern

Abstract

PURPOSE: To perform character recognition accurately and quickly by extracting the line segments that form the contour of a character pattern part hidden behind another character pattern part, and using them as character features. CONSTITUTION: A character pattern is scanned horizontally or vertically, and the line segments forming its contour are extracted as sets of black-white change points Q1-Q4 between the background and the character and used as character features for character recognition. In addition, a segment l6 forming the contour of a character pattern part hidden behind the part that appears first in the horizontal or vertical scanning direction is extracted as a set of odd-numbered black-white change points (Q3, ...) other than the first change point, and is likewise used as a character feature.

Description

[Detailed Description of the Invention] The present invention relates to a method of extracting character features used for character recognition, and aims to enable feature extraction of parts that remain hidden under the conventional method.

In character recognition, the features of a character as seen from the left and as seen from above are determined and supplied as data for recognition. Taking the numeral 3 shown in Fig. 1 as an example, the handwritten numeral 3 is scanned with a line scanner or the like to obtain a binary video signal, which is stored temporarily in memory. The video signal of the numeral 3 is then cut out of that memory (this is needed because the numeral is generally scanned together with other handwritten characters and stored in the same memory). On each horizontal scanning line, scanned from left to right, the point at which the signal first changes from 0 (background) to 1 (character) is found, and the set of these points is taken as the contour feature of the numeral 3 as seen from the left.
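The first-change-point rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bitmap is a hypothetical list of rows holding 0 (background) and 1 (character), the function name `left_profile` is invented here, and rows with no character pixel are reported as `None` rather than the value 0 that the patent text uses.

```python
def left_profile(bitmap):
    """For each horizontal scan line (row), return the X position of the
    first 0 -> 1 change when scanning left to right, or None for a blank row."""
    profile = []
    for row in bitmap:
        x_first = None
        for x, pixel in enumerate(row):
            if pixel == 1:      # first character pixel met on this scan line
                x_first = x
                break
        profile.append(x_first)
    return profile


# Hypothetical 5x5 bitmap, loosely shaped like a "3" (1 = character).
GLYPH = [
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 1, 1, 0],
]

LEFT = left_profile(GLYPH)
print(LEFT)
```

The resulting list holds one X value per scan line, which is the raw material for the left feature.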

With this logic, line segments l1, l2 and l3 become the features of the character as seen from the left (referred to here as the left features).

Similarly, finding on each vertical scanning line, scanned from top to bottom, the point at which the signal first changes from 0 to 1 yields the line segments shown in Fig. 2, which are the features of the character as seen from above (the upper features). These left and upper features, together with the right and lower features obtained in the same way, are easy to extract, and for characters with simple patterns such as the numerals 0, 1, 2, ... they make good feature data that can be used effectively for character recognition. For example, among the numerals 0 to 9 only 3 has a recessed line segment l2 lying between line segments l1 and l3, so if the target is known to be one of 0 to 9, this single feature is enough to recognize the numeral 3. However, many features remain hidden and are not extracted. In the left features, for example, the deeply recessed parts P1 and P2 are not extracted, and in the upper features only the part that forms, so to speak, a roof or eaves is extracted, while everything beneath it is hidden. Consequently, for characters with slightly more complex patterns such as kana, let alone kanji, feature extraction is insufficient and some characters cannot be separated and distinguished from others.
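The limitation noted here, that a view from above sees only the "roof", can be illustrated with a small Python sketch. The bitmap and the function names `upper_profile` and `columns_with_hidden_strokes` are hypothetical and are not taken from the patent; blank columns are reported as `None`.

```python
def upper_profile(bitmap):
    """For each vertical scan line (column), return the Y position of the
    first 0 -> 1 change when scanning top to bottom, or None if blank."""
    profile = []
    for x in range(len(bitmap[0])):
        y_first = None
        for y in range(len(bitmap)):
            if bitmap[y][x] == 1:
                y_first = y
                break
        profile.append(y_first)
    return profile


def columns_with_hidden_strokes(bitmap):
    """Return the columns that contain more than one run of character
    pixels, i.e. strokes the first-change-point rule cannot see."""
    hidden = []
    for x in range(len(bitmap[0])):
        runs, previous = 0, 0
        for y in range(len(bitmap)):
            pixel = bitmap[y][x]
            if pixel == 1 and previous == 0:
                runs += 1       # a new stroke starts on this scan line
            previous = pixel
        if runs > 1:
            hidden.append(x)
    return hidden


# Hypothetical bitmap: a top bar (the "roof") covering a lower shape.
GLYPH = [
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]

UPPER = upper_profile(GLYPH)
HIDDEN_COLUMNS = columns_with_hidden_strokes(GLYPH)
print(UPPER, HIDDEN_COLUMNS)
```

In this toy case the upper profile is flat, so it carries no information about the shape under the bar, even though three columns contain a second stroke.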

The present invention therefore aims to enable accurate and rapid character recognition by making it possible to extract features of the hidden parts as well. That is, in a feature extraction method in which a character pattern is scanned horizontally or vertically, the line segments forming the contour of the character pattern are extracted as sets of black-white change points between the background and the character, and these are used as character features for character recognition, the invention is characterized in that, along each horizontal or vertical scanning line, line segments forming the contour of a character pattern part hidden behind the part that appears first in the scanning direction are extracted as sets of the odd-numbered black-white change points other than the first, and are used as the character features. This is described in detail below with reference to the drawings.

The line segments l1, l2, ... described above are recognized as follows. Since the scanning lines differ in Y coordinate, finding for each Y coordinate the X coordinate of the first point at which a 0→1 inversion occurs yields a data group such as (Y0, 0), (Y1, 0), (Y2, 0), (Y3, X3), (Y4, X4), .... Here 0 means "there is no 0→1 inversion point", but if such entries are also written simply as Xi (i = 0, 1, 2, ...), the data can be expressed as (Xi, Yi). Processing is easy if the data are stored by recording the value Xi at memory address Yi. That is, the address counter is incremented by 1 at a time, the data Xi, Xi+1, Xi+2, ... at addresses Yi, Yi+1, Yi+2, ... are read out in succession, and ΔX = X_{i+1} - X_i is computed. Within line segments l1, l2 and l3 the difference ΔX is small, and from this the points can be judged to form a continuous line segment. Moreover, ΔX < 0 indicates a segment descending to the left, and ΔX > 0 a segment descending to the right.
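The difference test described above might be sketched as follows. The threshold `max_jump` is an assumption (the text only says the difference is "small"), the function names are invented for illustration, and missing points are represented by `None` instead of 0.

```python
def split_segments(profile, max_jump=1):
    """Group consecutive scan lines into line segments.

    profile[y] is the X of the first change point on scan line y, or None.
    A new segment starts whenever |X[i+1] - X[i]| exceeds max_jump
    (an assumed threshold) or a scan line has no point.
    """
    segments, current = [], []
    for y, x in enumerate(profile):
        if x is None:
            if current:
                segments.append(current)
                current = []
            continue
        if current and abs(x - current[-1][1]) > max_jump:
            segments.append(current)    # discontinuity: close the segment
            current = []
        current.append((y, x))
    if current:
        segments.append(current)
    return segments


def deltas(segment):
    """Differences X[i+1] - X[i] inside one segment; a negative value means
    the contour moves left while going down, a positive value right."""
    xs = [x for _, x in segment]
    return [b - a for a, b in zip(xs, xs[1:])]


# Hypothetical left profile containing three line segments.
PROFILE = [None, 2, 1, 1, 5, 5, 6, 2, 3, None]
SEGMENTS = split_segments(PROFILE)
for seg in SEGMENTS:
    print(seg, deltas(seg))
```

Storing one X per Y, as the text suggests, is what lets a single pass with an incrementing index do all the work.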

At the boundary between line segments l1 and l2, and between l2 and l3, ΔX suddenly becomes large. In such a case the line segment can be said to be broken, or at most connected only by a line segment parallel to the horizontal scanning line. The ends of a line segment are determined by such discontinuity points and by the points at which Xi changes between 0 and a nonzero value. The start of line segment l1 is of the latter kind and its end of the former; both ends of line segment l2 are of the former kind; the start of line segment l3 is of the former kind and its end of the latter. When both ends are discontinuity points, as with line segment l2, then in a character drawn as one connected figure, such as the numerals 0 to 9, there is character pattern at both ends, so the segment can be said to be closed at both ends. In contrast, line segments l1 and l3 can be said to be open. As stated above, having a line segment closed at both ends is a major distinguishing feature of the numeral 3.
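The open/closed distinction can be expressed as a small check on the scan lines adjacent to a segment. This is a sketch under assumptions: the profile is hypothetical, `None` stands for "no change point" (the text uses 0), and `classify_ends` is an invented name. An end is treated as closed when the neighbouring scan line still has a point (a discontinuity) and open when it has none.

```python
def classify_ends(profile, start, end):
    """Classify both ends of the segment covering scan lines start..end.

    An end is "closed" if the neighbouring scan line still has a change
    point (the segment ended at a discontinuity), and "open" if the
    neighbouring scan line is blank or outside the image.
    """
    def end_type(neighbour):
        outside = neighbour < 0 or neighbour >= len(profile)
        if outside or profile[neighbour] is None:
            return "open"
        return "closed"

    return end_type(start - 1), end_type(end + 1)


# Hypothetical left profile with three segments on rows 1-3, 4-6 and 7-8.
PROFILE = [None, 2, 1, 1, 5, 5, 6, 2, 3, None]

FIRST = classify_ends(PROFILE, 1, 3)
MIDDLE = classify_ends(PROFILE, 4, 6)
LAST = classify_ends(PROFILE, 7, 8)
print(FIRST, MIDDLE, LAST)
```

Only the middle segment is closed at both ends, which mirrors the role the text assigns to the recessed segment of the numeral 3.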

Now, the numeral 3 has the recesses P1 and P2, and if these are detected as well, the features of the numeral 3 are extracted still better. These recessed parts could not be extracted because they lie in the shadow of the line segments found by the first-change-point rule. If, however, the extraction logic is changed from "the first 0→1 change point" to "the third change point" (more generally, an odd-numbered change point; counting every black-white change along a scanning line that starts in the background, the odd-numbered ones are the 0→1 changes), the hidden parts can be extracted.
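A minimal Python sketch of counting change points on one scanning line follows. The scan line is hypothetical, the function names are invented here, and the scan is assumed to start in the background, so that odd-numbered change points are 0→1 changes and even-numbered ones are 1→0 changes.

```python
def change_points(scan_line):
    """Return the positions of every black-white change on one scan line,
    assuming the scan starts in the background (0)."""
    points = []
    previous = 0
    for x, pixel in enumerate(scan_line):
        if pixel != previous:
            points.append(x)
            previous = pixel
    return points


def odd_change_point(scan_line, k):
    """Return the k-th change point (k = 1, 3, 5, ...), or None if the
    scan line has fewer than k change points."""
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd number")
    points = change_points(scan_line)
    return points[k - 1] if len(points) >= k else None


# Hypothetical scan line crossing a recess: stroke, gap, stroke.
LINE = [0, 1, 1, 0, 0, 0, 1, 1, 0]

POINTS = change_points(LINE)          # corresponds to Q1..Q4 in the text
FIRST = odd_change_point(LINE, 1)     # conventional first change point
THIRD = odd_change_point(LINE, 3)     # contour of the hidden part
print(POINTS, FIRST, THIRD)
```

The first change point marks the near edge of the covering stroke, while the third marks the edge of whatever lies behind the gap.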

That is, as shown in Fig. 3, the 0→1 and 1→0 inversions on a scanning line passing through the recessed part P1 are the four points Q1, Q2, Q3 and Q4, and the inversion point Q3, which defines the contour of the hidden part l6, is the third. With this "third inversion point" logic the line segment l6 can be extracted, and combining it with the line segment obtained by the method of Fig. 1, that is, by the "first inversion point" logic, gives a line segment that reaches into the deepest part of the recess. Using such a line segment makes recognition of the numeral 3 still more reliable and easier. Specifically, ΔX along this line segment changes from positive to negative to positive to negative, which expresses the features of the numeral 3 well. If this line segment is combined with line segments l1 and l3, that is, with the logic that in the vertical direction the segments occur in the order l1, combined segment, l3, and that in the horizontal direction the combined segment lies to the right of l1 and l3 with both of its ends overlapping them, then even a very roughly handwritten numeral 3 can be separated and distinguished from other characters.
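One way to picture the combined segment is to take, on every scanning line, the third change point where it exists and the first change point otherwise, and then look at the signs of ΔX. The sketch below does this for a hypothetical bitmap; the merging rule and all names are assumptions made for illustration, not the patent's stated procedure.

```python
def change_points(scan_line):
    """Positions of every black-white change, starting from background."""
    points, previous = [], 0
    for x, pixel in enumerate(scan_line):
        if pixel != previous:
            points.append(x)
            previous = pixel
    return points


def combined_profile(bitmap):
    """Per scan line: third change point if present, else the first,
    else None (assumed merging rule)."""
    profile = []
    for row in bitmap:
        points = change_points(row)
        if len(points) >= 3:
            profile.append(points[2])
        elif points:
            profile.append(points[0])
        else:
            profile.append(None)
    return profile


def delta_signs(profile):
    """Signs of X[i+1] - X[i] over the scan lines that have a point."""
    xs = [x for x in profile if x is not None]
    signs = []
    for a, b in zip(xs, xs[1:]):
        d = b - a
        signs.append("+" if d > 0 else "-" if d < 0 else "0")
    return signs


# Hypothetical "3" whose upper and lower strokes hook back at the left,
# so the two recesses are hidden from a left-to-right first-point scan.
GLYPH = [
    [0, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 0],
]

COMBINED = combined_profile(GLYPH)
SIGNS = delta_signs(COMBINED)
print(COMBINED, SIGNS)
```

For this toy bitmap the signs alternate positive, negative, positive, negative, the same pattern the text attributes to the combined segment of the numeral 3.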

Fig. 4 shows an example of a handwritten "チ" and Fig. 5 an example of a handwritten "テ". It is not unusual for the presence or absence of the protrusion R to be the only point that distinguishes the two. However, because the protrusion is hidden by the upper character pattern part B, it cannot be extracted by the first-change-point logic of Fig. 1. In contrast, using the logic of the third change point, in particular the third change point on the vertical scanning lines, the line segments of the hidden part can be extracted (their end portions being those extracted by the "first change point"). Once these line segments are extracted, the presence or absence of the protrusion R can be checked by asking "is the change in Y coordinate uniform?" (taking the vertical direction as X), and テ and チ can then be told apart. There are many easily confused characters. For example, katakana pairs such as つ and 力, り and つ, ミ and シ, り and ン, and 二 and ン are difficult to distinguish by simple left features, upper features and the like when they are roughly written. These too can be expected to become distinguishable if the hidden parts are extracted with the "odd-numbered" logic, or if that result is combined with the simple left features and so on.
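The uniformity check can be sketched by taking the third change point on each vertical scanning line and looking at how much the resulting coordinates vary. Everything here is a hypothetical illustration: the two bitmaps are crude stand-ins for テ and チ, and the `tolerance` parameter and function names are assumptions.

```python
def column_change_points(bitmap, x):
    """Positions (Y) of every black-white change down column x."""
    points, previous = [], 0
    for y in range(len(bitmap)):
        pixel = bitmap[y][x]
        if pixel != previous:
            points.append(y)
            previous = pixel
    return points


def hidden_upper_profile(bitmap):
    """Third change point on each vertical scan line, or None."""
    profile = []
    for x in range(len(bitmap[0])):
        points = column_change_points(bitmap, x)
        profile.append(points[2] if len(points) >= 3 else None)
    return profile


def has_protrusion(bitmap, tolerance=0):
    """True if the hidden contour is not uniform, i.e. its coordinates
    vary by more than the (assumed) tolerance."""
    ys = [y for y in hidden_upper_profile(bitmap) if y is not None]
    return bool(ys) and max(ys) - min(ys) > tolerance


# Crude stand-in for "テ": top bar B, gap, flat second bar.
TE = [
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0],
]

# Crude stand-in for "チ": same, plus a protrusion R above the second bar.
CHI = [row[:] for row in TE]
CHI[2][3] = 1

print(hidden_upper_profile(TE), has_protrusion(TE))
print(hidden_upper_profile(CHI), has_protrusion(CHI))
```

With the first-change-point rule both bitmaps give the same upper profile under the top bar, so only the hidden contour separates them.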

In the recognition of handwritten numerals and kana, a large number of features, for example 1000 kinds, are used; these are arranged into a tree circuit of about 50 stages, and character recognition is performed by tracing through the tree. The features extracted according to the present invention are used as additional members of this set. If the features are appropriate, a result can be obtained by tracing only a relatively small number of stages, and the character recognition speed can be increased. In this respect the method of the present invention is highly effective.
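Tracing a feature tree can be pictured with a toy example. The tree below is entirely hypothetical (three leaves and two invented feature names, nowhere near the 1000 features and roughly 50 stages mentioned in the text); it only shows how a discriminating feature near the root shortens the path to a result.

```python
def classify(tree, features):
    """Walk a binary feature tree and return (label, stages_traced).

    Inner nodes are dicts with a "feature" name and "yes"/"no" branches;
    leaves are plain labels. The structure is an illustrative assumption.
    """
    node = tree
    stages = 0
    while isinstance(node, dict):
        stages += 1
        branch = "yes" if features[node["feature"]] else "no"
        node = node[branch]
    return node, stages


# Hypothetical tree using two invented feature names.
TREE = {
    "feature": "left_segment_closed_at_both_ends",
    "yes": "3",
    "no": {
        "feature": "hidden_contour_has_protrusion",
        "yes": "チ",
        "no": "テ",
    },
}

RESULT_3 = classify(TREE, {
    "left_segment_closed_at_both_ends": True,
    "hidden_contour_has_protrusion": False,
})
RESULT_CHI = classify(TREE, {
    "left_segment_closed_at_both_ends": False,
    "hidden_contour_has_protrusion": True,
})
print(RESULT_3, RESULT_CHI)
```

The stage count returned alongside the label makes the speed argument concrete: the fewer tests needed to reach a leaf, the faster the recognition.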

[Brief Description of the Drawings]

Figs. 1 and 2 are explanatory diagrams of the conventional method, and Figs. 3 to 5 are explanatory diagrams of the method of the present invention. In the drawings, the symbols l1, l2, ... denote line segments; those such as l6 denote line segments of hidden character pattern parts. Applicant: Fujitsu Ltd. Agent: Patent Attorney Minoru Aoyagi.

Claims (1)

[Claims] A method of extracting character features used for character recognition, in which a character pattern is scanned in the horizontal or vertical direction, line segments forming the contour of the character pattern are extracted as sets of black-white change points between the background and the character, and the line segments are used as character features for character recognition, characterized in that, along each horizontal or vertical scanning line, a line segment forming the contour of a character pattern part hidden behind the character pattern part that appears first in the scanning direction is extracted as a set of the odd-numbered black-white change points other than the first, and is used as said character feature.
JP56151051A 1981-09-24 1981-09-24 Feature extraction system Granted JPS5852783A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP56151051A JPS5852783A (en) 1981-09-24 1981-09-24 Feature extraction system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP56151051A JPS5852783A (en) 1981-09-24 1981-09-24 Feature extraction system

Publications (2)

Publication Number Publication Date
JPS5852783A true JPS5852783A (en) 1983-03-29
JPH0139156B2 JPH0139156B2 (en) 1989-08-18

Family

ID=15510220

Family Applications (1)

Application Number Title Priority Date Filing Date
JP56151051A Granted JPS5852783A (en) 1981-09-24 1981-09-24 Feature extraction system

Country Status (1)

Country Link
JP (1) JPS5852783A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05214767A (en) * 1992-01-31 1993-08-24 Daiwa House Ind Co Ltd Column and column connecting structure

Also Published As

Publication number Publication date
JPH0139156B2 (en) 1989-08-18

Similar Documents

Publication Publication Date Title
JP2940936B2 (en) Tablespace identification method
JPH0256707B2 (en)
JPS5852783A (en) Feature extraction system
JPH0981740A (en) Diagram input device
JP3130869B2 (en) Fingerprint image processing device, fingerprint image processing method, and recording medium
JP2506071B2 (en) Contour tracking device
JPH0522164B2 (en)
JP2575402B2 (en) Character recognition method
JPH09114925A (en) Optical character reader
JPS603676B2 (en) Intersection extraction method
JPH08212292A (en) Frame line recognition device
JPS58201183A (en) Feature extracting method of handwritten character recognition
JPS589471B2 (en) link link
JP2985981B2 (en) Binary image feature extraction method and image processing system using the same
JPH0310986B2 (en)
JPH0326878B2 (en)
EP0067236A1 (en) Character and figure isolating and extracting system
JPS5830621B2 (en) Intersection detection method
JPH02166583A (en) Character recognizing device
JPH02266478A (en) Method for recognizing drawing
JPH01156875A (en) Contour extracting system for binarized image
JPH03149676A (en) Method for recognizing image
JPH035630B2 (en)
JPH09167205A (en) Method for picking-up character pattern
JPS63182780A (en) Picture processing method for drawing reading device